{"id":59369,"date":"2026-01-13T19:17:07","date_gmt":"2026-01-13T11:17:07","guid":{"rendered":"https:\/\/www.wsisp.com\/helps\/59369.html"},"modified":"2026-01-13T19:17:07","modified_gmt":"2026-01-13T11:17:07","slug":"%e4%bb%8e%e9%9b%b6%e6%9e%84%e5%bb%ba%e5%a4%a7%e6%a8%a1%e5%9e%8b%e8%ae%b0%e5%bd%95%e4%b8%89-%e4%bb%8e%e9%9b%b6%e5%ae%9e%e7%8e%b0-transformer-%e6%a0%b8%e5%bf%83%e6%a8%a1%e5%9d%97","status":"publish","type":"post","link":"https:\/\/www.wsisp.com\/helps\/59369.html","title":{"rendered":"Building a Large Model from Scratch, Notes (3): Implementing the Core Transformer Modules from Scratch (Part 1)"},"content":{"rendered":"<p>This post follows the earlier articles in the series:<\/p>\n<p>Building a Large Model from Scratch, Reading Notes (1): Understanding Large Language Models<\/p>\n<p>Building a Large Model from Scratch, Reading Notes (2): Working with Text Data<\/p>\n<p>Building a Large Model from Scratch, Notes (3): Implementing the Core Transformer Modules from Scratch (Part 2) has been uploaded.<\/p>\n<p>This post continues the discussion of the attention mechanism.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" alt=\"\" height=\"596\" src=\"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111659-696629ab52650.png\" width=\"1644\" \/><\/p>\n<h2>Introduction<\/h2>\n<p>Why use an attention mechanism in a neural network?<\/p>\n<p>Traditional RNN-based 
machine translation models rely on limited context (such as the final hidden state) when generating each target word, so they struggle to model long-range dependencies, and their training is sequential and inefficient. Later Seq2Seq&#043;Attention models eased this information bottleneck by introducing attention, but they were still constrained by the sequential structure of the RNN.<\/p>\n<p>The Transformer architecture proposed relying on attention alone (&#034;Attention Is All You Need&#034;). Self-attention lets every token attend dynamically to all other tokens in the sequence during encoding or decoding, and assigns weights according to semantic relevance. This markedly improves the modeling of long-range dependencies and lets the computation across sequence positions run in parallel during training, which greatly speeds up training and helped drive the development of large language models.<\/p>\n<p>To make the mechanism concrete, consider translating the English sentence &#034;The cat sat on the mat.&#034; into the French &#034;Le chat s&#039;est assis sur le tapis.&#034;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" alt=\"\" height=\"2324\" src=\"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111700-696629ac7d242.png\" width=\"2372\" 
\/><\/p>\n<h2>What a basic self-attention framework looks like<\/h2>\n<p>So how does a Transformer actually work, and what does a basic self-attention framework look like? The goal of self-attention is to compute a context vector for every token. The walkthrough below uses a basic self-attention framework without trainable weights to show the process.<\/p>\n<h3>1. The core idea of self-attention<\/h3>\n<p>&#034;Every word should know what the other words in the sentence are saying and blend that information, weighted by relevance.&#034; The result is a context vector for each token.<\/p>\n<p>For example, in the sentence &#034;The cat sat on the mat&#034;:<\/p>\n<ul>\n<li>\n<p>when processing &#034;sat&#034;, the model should attend to &#034;cat&#034; (the subject) and &#034;mat&#034; (the location);<\/p>\n<\/li>\n<li>\n<p>when processing &#034;cat&#034;, it may attend to &#034;The&#034; (the article) and &#034;sat&#034; (the predicate).<\/p>\n<\/li>\n<\/ul>\n<p>Self-attention achieves this through the query, key, and value mechanism.<\/p>\n<hr 
\/>\n<h3>2. How to obtain the context vector of each token (simplified demo)<\/h3>\n<p>Tip: this part is a simplified, parameter-free self-attention demo. It only illustrates the weighted-blending logic of attention and is not how a Transformer is actually implemented.<\/p>\n<p>Take &#034;The cat sat on the mat&#034; as the example and embed each token as a 3-dimensional vector:<\/p>\n<p>tensor([[ 2.2315, -0.7460, -0.0614], #The (x^1)<br \/>\n        [-0.1235,  0.4462,  0.7265], #cat (x^2)<br \/>\n        [ 1.4900, -2.0396,  1.0440], #sat (x^3)<br \/>\n        [-0.6735, -0.5763, -0.9291], #on  (x^4)<br \/>\n        [ 0.7707,  0.5180,  0.2458], #the (x^5)<br \/>\n        [ 0.6508,  0.1164, -1.3904]] #mat (x^6)<br \/>\n        ) <\/p>\n<p>The context vector of each token is computed as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" alt=\"\" height=\"252\" src=\"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111704-696629b064965.png\" width=\"1764\" \/><\/p>\n<h4>2.1 Computing attention scores<\/h4>\n<p>The first step of self-attention is to compute attention scores: take the dot product of the query vector at each position i with the key vectors at all positions j to obtain the score A[i,j]. In this parameter-free demo the embedding itself plays the role of both query and key. When vector lengths are similar, the dot product approximately reflects semantic similarity, so it is commonly used to measure how strongly two tokens 
are related. For vectors of comparable length, a larger dot product means the two vectors point in more similar directions (a smaller angle between them), which indicates higher semantic similarity and therefore a higher attention score. The dot products between the tokens in the example are:<\/p>\n<p>tensor([[ 5.5401, -0.6531,  4.7825, -1.0159,  1.3183,  1.4509], #attention scores between The and every token; the embeddings are untrained, so these values come from random initialization<br \/>\n        [-0.6531,  0.7421, -0.3357, -0.8489,  0.3145, -1.0385],<br \/>\n        [ 4.7825, -0.3357,  7.4699, -0.7981,  0.3484, -0.7192],<br \/>\n        [-1.0159, -0.8489, -0.7981,  1.6490, -1.0459,  0.7864],<br \/>\n        [ 1.3183,  0.3145,  0.3484, -1.0459,  0.9227,  0.2202],<br \/>\n        [ 1.4509, -1.0385, -0.7192,  0.7864,  0.2202,  2.3702]],<br \/>\n       grad_fn&#061;&lt;MmBackward0&gt;) <\/p>\n<h4>2.2 Normalizing the attention scores<\/h4>\n<p>Normalize the attention scores with softmax. The normalized result is:<\/p>\n<p>Attention Weight 2:  tensor([[6.6504e-01, 1.3588e-03, 3.1176e-01, 9.4538e-04, 9.7572e-03, 1.1141e-02], #normalized weights for the token The; they sum to 1<br \/>\n        [9.4845e-02, 3.8278e-01, 1.3028e-01, 7.7982e-02, 2.4960e-01, 6.4515e-02],<br \/>\n        [6.3619e-02, 3.8089e-04, 9.3475e-01, 2.3987e-04, 7.5490e-04, 2.5956e-04],<br \/>\n        [4.0282e-02, 4.7603e-02, 5.0084e-02, 5.7868e-01, 3.9091e-02, 2.4426e-01],<br \/>\n        [3.5132e-01, 1.2875e-01, 1.3320e-01, 3.3033e-02, 2.3654e-01, 1.1716e-01],<br \/>\n        [2.2166e-01, 1.8390e-02, 2.5308e-02, 1.1406e-01, 6.4745e-02, 5.5584e-01]],<br \/>\n       grad_fn&#061;&lt;SoftmaxBackward0&gt;)<br \/>\nAttention Weight 2 Sum: tensor([1.0000, 
1.0000, 1.0000, 1.0000, 1.0000, 1.0000],<br \/>\n       grad_fn&#061;&lt;SumBackward1&gt;) <\/p>\n<h4>2.3 Computing the context vectors<\/h4>\n<p>Multiply each token embedding by its attention weight and sum the resulting vectors to obtain the context vector. The result is:<\/p>\n<p>tensor([[ 1.9626, -1.1256,  0.2716], #context vector for The: the attention-weighted sum of all embeddings, so the vector at this position blends information from the whole sentence<br \/>\n        [ 0.5403, -0.0738,  0.3074],<br \/>\n        [ 1.5353, -1.9535,  0.9718],<br \/>\n        [-0.0420, -0.3958, -0.7833],<br \/>\n        [ 1.2028, -0.3592,  0.0755],<br \/>\n        [ 0.8649, -0.1763, -0.8367]], grad_fn&#061;&lt;MmBackward0&gt;) <\/p>\n<h4>2.4 Code example<\/h4>\n<p>&#034;&#034;&#034;<br \/>\nHow to obtain the context vector of each token (simplified parameter-free self-attention demo)<br \/>\n&#034;&#034;&#034;<\/p>\n<p>import torch<br \/>\nimport tiktoken<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 1. 
Tokenization and embedding &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# Initialize the tokenizer (GPT2)<br \/>\ntokenizer &#061; tiktoken.get_encoding(&#039;gpt2&#039;)<br \/>\ntext &#061; &#034;The cat sat on the mat&#034;<br \/>\ntoken_ids &#061; tokenizer.encode(text)  # list of 6 token IDs, one per word of this sentence<br \/>\nprint(f&#034;token_ids after tokenization: {token_ids}, shape: {torch.tensor(token_ids).shape}&#034;)  # shape: [6]<\/p>\n<p># Define the embedding layer parameters<br \/>\nvocab_size &#061; tokenizer.n_vocab  # GPT2 vocabulary size<br \/>\nembedding_dim &#061; 3  # custom embedding dimension (for demonstration only)<br \/>\nembedding_layer &#061; torch.nn.Embedding(<br \/>\n    num_embeddings&#061;vocab_size,<br \/>\n    embedding_dim&#061;embedding_dim<br \/>\n)<\/p>\n<p># Convert the token IDs to a PyTorch tensor<br \/>\ninput_ids &#061; torch.tensor(token_ids)  # shape: [seq_len] &#061; [6]<\/p>\n<p># Look up the embeddings: map discrete tokens to continuous vectors<br \/>\ninputs &#061; embedding_layer(input_ids)  # shape: [seq_len, embedding_dim] &#061; [6, 3]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Input embeddings &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;inputs shape:&#034;, inputs.shape)  # output: torch.Size([6, 3])<br \/>\nprint(&#034;inputs:\\n&#034;, inputs)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 2. 
Computing attention scores (dot product) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# Dot-product attention scores: similarity of each token to every token<br \/>\n# inputs (6,3) &#064; inputs.T (3,6) \u2192 attn_scores (6,6)<br \/>\nattn_scores &#061; inputs &#064; inputs.T  # shape: [seq_len, seq_len] &#061; [6, 6]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Attention scores &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;attn_scores shape:&#034;, attn_scores.shape)  # output: torch.Size([6, 6])<br \/>\nprint(&#034;attn_scores:\\n&#034;, attn_scores)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 3. Normalizing attention scores (softmax) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# dim&#061;-1: normalize over the last dimension (the 6 scores of each token) so that every row sums to 1<br \/>\nattn_weights &#061; torch.softmax(attn_scores, dim&#061;-1)  # shape: [6, 6]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Normalized attention weights &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;Attention Weights shape:&#034;, attn_weights.shape)  # output: torch.Size([6, 6])<br \/>\nprint(&#034;Attention Weights:\\n&#034;, attn_weights)<br \/>\n# Check that every row sums to 1 (the effect of normalization)<br \/>\nprint(&#034;Attention Weights Sum (per row):\\n&#034;, attn_weights.sum(dim&#061;-1))  # shape: [6], all values 1.0<\/p>\n<p># 
&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 4. Computing the context vectors (weighted sum) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# attn_weights (6,6) &#064; inputs (6,3) \u2192 all_context_vecs (6,3)<br \/>\nall_context_vecs &#061; attn_weights &#064; inputs  # shape: [seq_len, embedding_dim] &#061; [6, 3]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Final context vectors &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;Context Vectors shape:&#034;, all_context_vecs.shape)  # output: torch.Size([6, 3])<br \/>\nprint(&#034;Context Vectors:\\n&#034;, all_context_vecs) <\/p>\n<h3>3. Adding trainable weights to self-attention<\/h3>\n<p>The demo above walks through how self-attention weights are computed. It is intuitive, but it has no capacity to learn.<\/p>\n<p>Real models introduce trainable Q\/K\/V projection matrices so that attention can adjust its focus to the task. In a real Transformer (such as GPT), the embedding layer is learned from large amounts of text; tokens that often co-occur, such as &#034;The&#034; and &#034;cat&#034;, can end up with related embeddings; and self-attention genuinely learns what to look at: when generating &#034;cat&#034; it attends to &#034;The&#034;, and when generating &#034;sat&#034; it attends to &#034;cat&#034; and 
&#034;mat&#034;. At random initialization, however, the model has not yet learned any semantic pattern, and the attention weights are close to random. So how is self-attention trained? The first step is to give it trainable weights, which let the model learn to produce &#034;good&#034; context vectors.<\/p>\n<p>The core learnable parameters of self-attention are W_Q, W_K, W_V, and the output projection matrix. During end-to-end training these parameters are updated iteratively by backpropagation, which eventually gives the model a task-oriented, dynamic focus: for different tasks and inputs it automatically identifies the most relevant tokens.<\/p>\n<p>The projections W_Q, W_K, and W_V can be compared to a learnable, &#034;soft&#034; database lookup. The keys produced by W_K correspond to the index or keys of the database (Index\/Key); the values produced by W_V correspond to the stored data or records (Value\/Record); the query produced by W_Q corresponds to the request issued by the user (Query); the attention score corresponds to a similarity match score; and the output corresponds to the weighted retrieval result.<\/p>\n<p>Take &#034;The cat sat on the mat&#034; as the example and embed each token as a 3-dimensional vector:<\/p>\n<p>tensor([[ 2.2315, -0.7460, -0.0614], #The (x^1)<br \/>\n        [-0.1235,  
0.4462,  0.7265], #cat (x^2)<br \/>\n        [ 1.4900, -2.0396,  1.0440], #sat (x^3)<br \/>\n        [-0.6735, -0.5763, -0.9291], #on  (x^4)<br \/>\n        [ 0.7707,  0.5180,  0.2458], #the (x^5)<br \/>\n        [ 0.6508,  0.1164, -1.3904]] #mat (x^6)<br \/>\n        ) <\/p>\n<p>With trainable weights added, the context vector of a token is computed as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" alt=\"\" height=\"496\" src=\"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111704-696629b0ea90b.png\" width=\"2246\" \/><\/p>\n<h4>3.1 Initializing the weight matrices<\/h4>\n<p>Initialize three weight matrices that project the original embeddings into three different subspaces (query, key, and value). These matrices are model parameters and are updated by backpropagation during training.<\/p>\n<p>W_query Parameter containing:<br \/>\ntensor([[0.2961, 0.5166],<br \/>\n        [0.2517, 0.6886],<br \/>\n        [0.0740, 0.8665]]) # shape (3, 2)<br \/>\nW_key Parameter containing:<br \/>\ntensor([[0.1366, 0.1025],<br \/>\n        [0.1841, 0.7264],<br \/>\n        [0.3153, 0.6871]]) # shape (3, 2)<br \/>\nW_value Parameter containing:<br \/>\ntensor([[0.0756, 0.1966],<br \/>\n        [0.3164, 0.4017],<br \/>\n        [0.1186, 0.8274]]) # shape (3, 2) <\/p>\n<h4>3.2 Computing the query, key, and value vectors<\/h4>\n<p>Take the second token, cat, as the example and compute its query, key, and value vectors by matrix multiplication, with an input embedding dimension of 3 and an output dimension of 2. Worked example for the first component of query_2 &#061; (-0.1235 * 0.2961) &#043; (0.4462 
* 0.2517) &#043; (0.7265 * 0.0740) \u2248 -0.0366 &#043; 0.1123 &#043; 0.0538 \u2248 0.1295, and likewise for the remaining components. The query vector represents the &#034;query request&#034; that cat issues in the current context: it wants to know which tokens are relevant to it.<\/p>\n<p>query_2 tensor([0.1295, 0.8730])<br \/>\nkey_2 tensor([0.2943, 0.8107])<br \/>\nvalue_2 tensor([0.2180, 0.7561]) <\/p>\n<h4>3.3 Computing the key and value vectors of all tokens<\/h4>\n<p>Next, obtain the key and value vectors of all input tokens. Although only the context vector of cat is being computed, the keys and values of every token are still needed: the keys determine the attention weights relative to cat, and the values are what those weights combine. All keys and all values can each be computed with a single matrix multiplication. Worked example for the first row, first column of keys (the embedding of the first token, The, multiplied by the first column of W_K): 2.2315\u00d70.1366 &#043; (-0.7460)\u00d70.1841 &#043; (-0.0614)\u00d70.3153 \u2248 0.1481. The results are:<\/p>\n<p>keys tensor([[ 0.1481, -0.3554], #key vectors of all input tokens, one row per token (first row: The)<br \/>\n        [ 0.2943,  0.8107],<br \/>\n        [ 0.1572, -0.6116],<br \/>\n        [-0.4910, -1.1261],<br \/>\n        [ 0.2781,  0.6242],<br \/>\n        [-0.3280, -0.8041]])<br \/>\nvalues tensor([[-0.0745,  0.0883], #value vectors of all input tokens, one row per token (first row: The)<br \/>\n        [ 0.2180,  0.7561],<br \/>\n        
[-0.4089,  0.3374],<br \/>\n        [-0.3435, -1.1327],<br \/>\n        [ 0.2513,  0.5630],<br \/>\n        [-0.0788, -0.9757]]) <\/p>\n<h4>3.4 Computing attention scores<\/h4>\n<p>Compute the unscaled attention scores for cat by taking the dot product of its query vector with each key vector from step 3.3. Worked example for cat against its own key: 0.1295\u00d70.2943 &#043; 0.8730\u00d70.8107 \u2248 0.0381 &#043; 0.7078 \u2248 0.7458<\/p>\n<p>attn_scores_2: tensor([-0.2911,  0.7458, -0.5136, -1.0466,  0.5809, -0.7444]) #unscaled attention scores: dot products of the cat query with every key vector<br \/>\n#Observations:<br \/>\n#&#034;cat&#034; scores highest with itself (i&#061;2, 0.7458) \u2192 attending to oneself is common<br \/>\n#&#034;the&#034; (i&#061;5) scores second highest (0.5809) \u2192 with untrained weights this is coincidental rather than a learned article-noun relationship<br \/>\n#&#034;sat&#034; (i&#061;3) scores low (-0.5136) \u2192 the current weights do not capture the subject-predicate relationship (because they are untrained) <\/p>\n<h4>3.5 
Converting attention scores to attention weights<\/h4>\n<p>Convert the attention scores to attention weights by scaling them and then applying softmax. The scores are divided by the square root of the key dimension d_k (here d_k &#061; 2, the output dimension of the projections). When this dimension is large, the variance of the dot products grows and pushes softmax into its saturated region, where gradients are close to 0 and training slows down; scaling mitigates this, which is why the mechanism is called scaled dot-product attention. Worked example for the score of cat with itself: 0.7458 \/ sqrt(2) \u2248 0.527. Softmax then turns the scaled scores into a probability distribution: each weight &#061; exp(score) \/ sum of exp(score) over all scaled scores.<\/p>\n<p>The resulting weights are:<\/p>\n<p>attn_weights_2: tensor([0.1408, 0.2932, 0.1203, 0.0825, 0.2609, 0.1022])<br \/>\n#For a given input these weights are determined entirely by the current W_Q\/W_K. After training, the weights would be adjusted so that, for example, &#034;sat&#034; attends more to &#034;cat&#034;; for now they are random. <\/p>\n<h4>3.6 
Computing the context vector<\/h4>\n<p>The context vector is the weighted sum of the value vectors, with the attention weights acting as the weighting factors that express how important each value vector is. Worked example for the first component: 0.1408\u00d7(-0.0745) &#043; 0.2932\u00d70.2180 &#043; 0.1203\u00d7(-0.4089) &#043; 0.0825\u00d7(-0.3435) &#043; 0.2609\u00d70.2513 &#043; 0.1022\u00d7(-0.0788) \u2248 0.0334. The resulting context vector is:<\/p>\n<p>context_vec_2 tensor([0.0334, 0.2284])<br \/>\n#This is the context-aware representation of &#034;cat&#034;:<br \/>\n#it is no longer just the original embedding [-0.1235, 0.4462, 0.7265] but a 2-dimensional vector that blends information from the whole sentence. It is later passed to the FFN or the next layer. The output dimension is configurable and need not be 2. <\/p>\n<h4>3.7 Code example<\/h4>\n<p>&#034;&#034;&#034;<br \/>\nSelf-attention with trainable Q\/K\/V weight matrices (full version)<br \/>\nThe code demonstrates only the &#039;cat&#039; token (the 2nd token); in practice all tokens are computed in parallel<br \/>\n&#034;&#034;&#034;<\/p>\n<p>import torch<br \/>\ntorch.manual_seed(123)  # fix the random seed so the results are reproducible<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 1. 
Initialize the input embeddings (mock values) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# Mock 3-dimensional embeddings for 6 tokens, matching the sentence: The cat sat on the mat<br \/>\ninputs &#061; torch.tensor(<br \/>\n [[2.2315, -0.7460, -0.0614],  # The (token 0)<br \/>\n  [-0.1235, 0.4462, 0.7265],  # cat (token 1, the token this demo focuses on)<br \/>\n  [1.4900, -2.0396, 1.0440],  # sat (token 2)<br \/>\n  [-0.6735, -0.5763, -0.9291],  # on (token 3)<br \/>\n  [0.7707, 0.5180, 0.2458],  # the (token 4)<br \/>\n  [0.6508, 0.1164, -1.3904]]  # mat (token 5)<br \/>\n)  # shape: [seq_len, d_in] &#061; [6, 3]<br \/>\nprint(&#034;Input embeddings, inputs shape:&#034;, inputs.shape)  # output: torch.Size([6, 3])<\/p>\n<p># Select the 2nd token (cat, index 1) as the query token for the demo<br \/>\nx_2 &#061; inputs[1]  # shape: [d_in] &#061; [3]<br \/>\nprint(&#034;\\nOriginal embedding of the cat token, x_2 shape:&#034;, x_2.shape)  # output: torch.Size([3])<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 2. 
Define the dimensions and initialize the Q\/K\/V weight matrices &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\nd_in &#061; inputs.shape[1]  # input embedding dimension: 3<br \/>\nd_out &#061; 2  # Q\/K\/V output dimension: 2 (configurable; typically d_out&#061;d_model\/h, where h is the number of attention heads)<\/p>\n<p># Initialize the weight matrices (requires_grad&#061;False here for the demo; set it to True for actual training)<br \/>\n# W_query: [d_in, d_out] &#061; [3, 2]<br \/>\nW_query &#061; torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad&#061;False)<br \/>\n# W_key: [d_in, d_out] &#061; [3, 2]<br \/>\nW_key &#061; torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad&#061;False)<br \/>\n# W_value: [d_in, d_out] &#061; [3, 2]<br \/>\nW_value &#061; torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad&#061;False)<\/p>\n<p>print(&#034;\\n&#061;&#061;&#061; Q\/K\/V weight matrices &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;W_query shape:&#034;, W_query.shape, &#034;\\n&#034;, W_query)<br \/>\nprint(&#034;W_key shape:&#034;, W_key.shape, &#034;\\n&#034;, W_key)<br \/>\nprint(&#034;W_value shape:&#034;, W_value.shape, &#034;\\n&#034;, W_value)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 3. 
Compute Q\/K\/V for a single token (cat) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# query_2: x_2 (3) &#064; W_query (3,2) \u2192 [2]<br \/>\nquery_2 &#061; x_2 &#064; W_query  # shape: [d_out] &#061; [2]<br \/>\n# key_2: x_2 (3) &#064; W_key (3,2) \u2192 [2]<br \/>\nkey_2 &#061; x_2 &#064; W_key      # shape: [d_out] &#061; [2]<br \/>\n# value_2: x_2 (3) &#064; W_value (3,2) \u2192 [2]<br \/>\nvalue_2 &#061; x_2 &#064; W_value  # shape: [d_out] &#061; [2]<\/p>\n<p>print(&#034;\\n&#061;&#061;&#061; Q\/K\/V vectors of the cat token &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;query_2 shape:&#034;, query_2.shape, &#034;\\n&#034;, query_2)<br \/>\nprint(&#034;key_2 shape:&#034;, key_2.shape, &#034;\\n&#034;, key_2)<br \/>\nprint(&#034;value_2 shape:&#034;, value_2.shape, &#034;\\n&#034;, value_2)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 4. Compute the K\/V matrices for all tokens &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# keys: inputs (6,3) &#064; W_key (3,2) \u2192 (6,2)<br \/>\nkeys &#061; inputs &#064; W_key     # shape: [seq_len, d_out] &#061; [6, 2]<br \/>\n# values: inputs (6,3) &#064; W_value (3,2) \u2192 (6,2)<br \/>\nvalues &#061; inputs &#064; W_value # shape: [seq_len, d_out] &#061; [6, 2]<\/p>\n<p>print(&#034;\\n&#061;&#061;&#061; K\/V matrices of all tokens &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;keys shape:&#034;, keys.shape, &#034;\\n&#034;, keys)<br \/>\nprint(&#034;values shape:&#034;, values.shape, &#034;\\n&#034;, values)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 5. 
Compute the attention scores for the cat token (unscaled) &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# attn_scores_2: query_2 (2) &#064; keys.T (2,6) \u2192 (6)<br \/>\nattn_scores_2 &#061; query_2 &#064; keys.T  # shape: [seq_len] &#061; [6]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Unscaled attention scores for the cat token &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;attn_scores_2 shape:&#034;, attn_scores_2.shape, &#034;\\n&#034;, attn_scores_2)<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 6. Scale and apply softmax to obtain the attention weights &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\nd_k &#061; keys.shape[-1]  # key vector dimension: 2<br \/>\n# Scaling: scores \/ \u221ad_k, which keeps large dot products from saturating the softmax<br \/>\nscaled_scores_2 &#061; attn_scores_2 \/ (d_k ** 0.5)  # shape: [6]<br \/>\n# Softmax normalization yields the attention weights<br \/>\nattn_weights_2 &#061; torch.softmax(scaled_scores_2, dim&#061;-1)  # shape: [6]<\/p>\n<p>print(&#034;\\n&#061;&#061;&#061; Attention weights for the cat token &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;attn_weights_2 shape:&#034;, attn_weights_2.shape, &#034;\\n&#034;, attn_weights_2)<br \/>\nprint(&#034;attn_weights_2 sum:&#034;, attn_weights_2.sum())  # sanity check: should sum to 1.0<\/p>\n<p># &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061; 7. 
Compute the context vector for the cat token &#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;&#061;<br \/>\n# context_vec_2: attn_weights_2 (6) &#064; values (6,2) \u2192 (2)<br \/>\ncontext_vec_2 &#061; attn_weights_2 &#064; values  # shape: [d_out] &#061; [2]<br \/>\nprint(&#034;\\n&#061;&#061;&#061; Final context vector for the cat token &#061;&#061;&#061;&#034;)<br \/>\nprint(&#034;context_vec_2 shape:&#034;, context_vec_2.shape, &#034;\\n&#034;, context_vec_2)<\/p>\n<h2>Summary<\/h2>\n<p>This article walked through the basic principles and implementation of the self-attention mechanism and built a simplified self-attention framework, moving from the parameter-free version to the full form with trainable weights, to show step by step how it produces a context-aware representation for each token.<\/p>\n<p>However, basic self-attention still has limitations:<\/p>\n<ul>\n<li>\n<p>No causal constraint: in language generation tasks, the model must not \u201cpeek\u201d at future tokens;<\/p>\n<\/li>\n<li>\n<p>Limited expressiveness of a single head: a single attention head may fail to capture diverse semantic relationships;<\/p>\n<\/li>\n<li>\n<p>Prone to overfitting: especially on small datasets, high-dimensional attention weights can hurt generalization.<\/p>\n<\/li>\n<\/ul>\n<p>To address these limitations, modern large language models introduce several enhancements: 
<\/p>\n<ul>\n<li>\n<p>Causal Attention: a mask ensures that, during decoding, each position attends only to tokens that have already been generated, which keeps autoregressive generation valid;<\/p>\n<\/li>\n<li>\n<p>Multi-Head Attention: multiple attention heads run in parallel, each learning semantic relationships in a different subspace (such as syntax, coreference, and topic), and their outputs are then combined, which substantially improves the expressiveness of the model;<\/p>\n<\/li>\n<li>\n<p>Attention Dropout: during training, a random subset of the attention weights is set to zero, which keeps the model from relying too heavily on specific token pairs and improves robustness.<\/p>\n<\/li>\n<\/ul>\n<p>Together, these improvements form the foundation of the Transformer, allowing it both to process long sequences efficiently and to learn complex linguistic patterns from massive text corpora. From \u201cAttention is All You Need\u201d to current models with hundreds of billions of parameters, the self-attention mechanism has remained the core engine driving progress in natural language processing.<\/p>\n<\/p>\n<p>These improvements are covered in detail in the next article. Likes, bookmarks, and follows are welcome, and so is discussion.<\/p>\n<p>The next article has been uploaded: \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer 
\u6838\u5fc3\u6a21\u5757(\u4e0b)<\/p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u63a5\u524d\u7f6e\u6587\u7ae0&#xff1a;<br \/>\n\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e00)\u2014\u2014\u7406\u89e3\u5927\u8bed\u8a00\u6a21\u578b<br \/>\n\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e8c)\u2014\u2014\u5904\u7406\u6587\u672c\u6570\u636e<br \/>\n\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0b)\u5df2\u4e0a\u4f20<br \/>\n\u672c\u6587\u5c06\u7ee7\u7eed\u8ba8\u8bba\u6ce8\u610f\u529b\u673a\u5236 \u5f15\u8a00<br \/>\n\u4e3a\u4ec0\u4e48\u8981\u5728\u795e\u7ecf\u7f51\u7edc\u4e2d\u4f7f\u7528\u6ce8\u610f\u529b\u673a\u5236\u5462&#xff1f;<br \/>\n\u4f20\u7edf\u7684 RNN-based \u673a\u5668\u7ffb\u8bd1\u6a21\u578b\u5728\u751f\u6210\u76ee\u6807\u8bcd\u65f6&#xff0c;\u4f9d\u8d56\u4e8e\u6709\u9650\u7684\u4e0a\u4e0b\u6587\u4fe1\u606f&#xff08;\u5982\u6700\u540e\u4e00\u4e2a\u9690\u85cf\u72b6\u6001&#xff09;&#xff0c;\u96be\u4ee5\u6709\u6548\u5efa\u6a21\u957f\u8ddd\u79bb<\/p>\n","protected":false},"author":2,"featured_media":59365,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[841,50,86],"topic":[],"class_list":["post-59369","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-server","tag-transformer","tag-50","tag-86"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.wsisp.com\/helps\/59369.html\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3\" \/>\n<meta property=\"og:description\" content=\"\u63a5\u524d\u7f6e\u6587\u7ae0&#xff1a; \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e00)\u2014\u2014\u7406\u89e3\u5927\u8bed\u8a00\u6a21\u578b \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e8c)\u2014\u2014\u5904\u7406\u6587\u672c\u6570\u636e \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0b)\u5df2\u4e0a\u4f20 \u672c\u6587\u5c06\u7ee7\u7eed\u8ba8\u8bba\u6ce8\u610f\u529b\u673a\u5236 \u5f15\u8a00 \u4e3a\u4ec0\u4e48\u8981\u5728\u795e\u7ecf\u7f51\u7edc\u4e2d\u4f7f\u7528\u6ce8\u610f\u529b\u673a\u5236\u5462&#xff1f; \u4f20\u7edf\u7684 RNN-based \u673a\u5668\u7ffb\u8bd1\u6a21\u578b\u5728\u751f\u6210\u76ee\u6807\u8bcd\u65f6&#xff0c;\u4f9d\u8d56\u4e8e\u6709\u9650\u7684\u4e0a\u4e0b\u6587\u4fe1\u606f&#xff08;\u5982\u6700\u540e\u4e00\u4e2a\u9690\u85cf\u72b6\u6001&#xff09;&#xff0c;\u96be\u4ee5\u6709\u6548\u5efa\u6a21\u957f\u8ddd\u79bb\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.wsisp.com\/helps\/59369.html\" \/>\n<meta property=\"og:site_name\" content=\"\u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3\" \/>\n<meta property=\"article:published_time\" content=\"2026-01-13T11:17:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111659-696629ab52650.png\" \/>\n<meta name=\"author\" content=\"admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" 
\/>\n\t<meta name=\"twitter:data1\" content=\"admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 \u5206\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.wsisp.com\/helps\/59369.html\",\"url\":\"https:\/\/www.wsisp.com\/helps\/59369.html\",\"name\":\"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3\",\"isPartOf\":{\"@id\":\"https:\/\/www.wsisp.com\/helps\/#website\"},\"datePublished\":\"2026-01-13T11:17:07+00:00\",\"dateModified\":\"2026-01-13T11:17:07+00:00\",\"author\":{\"@id\":\"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/358e386c577a3ab51c4493330a20ad41\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.wsisp.com\/helps\/59369.html#breadcrumb\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.wsisp.com\/helps\/59369.html\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.wsisp.com\/helps\/59369.html#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\u9996\u9875\",\"item\":\"https:\/\/www.wsisp.com\/helps\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer 
\u6838\u5fc3\u6a21\u5757(\u4e0a)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.wsisp.com\/helps\/#website\",\"url\":\"https:\/\/www.wsisp.com\/helps\/\",\"name\":\"\u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3\",\"description\":\"\u9999\u6e2f\u670d\u52a1\u5668_\u9999\u6e2f\u4e91\u670d\u52a1\u5668\u8d44\u8baf_\u670d\u52a1\u5668\u5e2e\u52a9\u6587\u6863_\u670d\u52a1\u5668\u6559\u7a0b\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.wsisp.com\/helps\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/358e386c577a3ab51c4493330a20ad41\",\"name\":\"admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/gravatar.wp-china-yes.net\/avatar\/?s=96&d=mystery\",\"contentUrl\":\"https:\/\/gravatar.wp-china-yes.net\/avatar\/?s=96&d=mystery\",\"caption\":\"admin\"},\"sameAs\":[\"http:\/\/wp.wsisp.com\"],\"url\":\"https:\/\/www.wsisp.com\/helps\/author\/admin\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.wsisp.com\/helps\/59369.html","og_locale":"zh_CN","og_type":"article","og_title":"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3","og_description":"\u63a5\u524d\u7f6e\u6587\u7ae0&#xff1a; \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e00)\u2014\u2014\u7406\u89e3\u5927\u8bed\u8a00\u6a21\u578b \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bfb\u4e66\u8bb0\u5f55(\u4e8c)\u2014\u2014\u5904\u7406\u6587\u672c\u6570\u636e \u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0b)\u5df2\u4e0a\u4f20 \u672c\u6587\u5c06\u7ee7\u7eed\u8ba8\u8bba\u6ce8\u610f\u529b\u673a\u5236 \u5f15\u8a00 \u4e3a\u4ec0\u4e48\u8981\u5728\u795e\u7ecf\u7f51\u7edc\u4e2d\u4f7f\u7528\u6ce8\u610f\u529b\u673a\u5236\u5462&#xff1f; \u4f20\u7edf\u7684 RNN-based 
\u673a\u5668\u7ffb\u8bd1\u6a21\u578b\u5728\u751f\u6210\u76ee\u6807\u8bcd\u65f6&#xff0c;\u4f9d\u8d56\u4e8e\u6709\u9650\u7684\u4e0a\u4e0b\u6587\u4fe1\u606f&#xff08;\u5982\u6700\u540e\u4e00\u4e2a\u9690\u85cf\u72b6\u6001&#xff09;&#xff0c;\u96be\u4ee5\u6709\u6548\u5efa\u6a21\u957f\u8ddd\u79bb","og_url":"https:\/\/www.wsisp.com\/helps\/59369.html","og_site_name":"\u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3","article_published_time":"2026-01-13T11:17:07+00:00","og_image":[{"url":"https:\/\/www.wsisp.com\/helps\/wp-content\/uploads\/2026\/01\/20260113111659-696629ab52650.png"}],"author":"admin","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"admin","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"7 \u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.wsisp.com\/helps\/59369.html","url":"https:\/\/www.wsisp.com\/helps\/59369.html","name":"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer \u6838\u5fc3\u6a21\u5757(\u4e0a) - \u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3","isPartOf":{"@id":"https:\/\/www.wsisp.com\/helps\/#website"},"datePublished":"2026-01-13T11:17:07+00:00","dateModified":"2026-01-13T11:17:07+00:00","author":{"@id":"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/358e386c577a3ab51c4493330a20ad41"},"breadcrumb":{"@id":"https:\/\/www.wsisp.com\/helps\/59369.html#breadcrumb"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.wsisp.com\/helps\/59369.html"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.wsisp.com\/helps\/59369.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\u9996\u9875","item":"https:\/\/www.wsisp.com\/helps"},{"@type":"ListItem","position":2,"name":"\u4ece\u96f6\u6784\u5efa\u5927\u6a21\u578b\u8bb0\u5f55(\u4e09)\u2014\u2014\u4ece\u96f6\u5b9e\u73b0 Transformer 
\u6838\u5fc3\u6a21\u5757(\u4e0a)"}]},{"@type":"WebSite","@id":"https:\/\/www.wsisp.com\/helps\/#website","url":"https:\/\/www.wsisp.com\/helps\/","name":"\u7f51\u7855\u4e92\u8054\u5e2e\u52a9\u4e2d\u5fc3","description":"\u9999\u6e2f\u670d\u52a1\u5668_\u9999\u6e2f\u4e91\u670d\u52a1\u5668\u8d44\u8baf_\u670d\u52a1\u5668\u5e2e\u52a9\u6587\u6863_\u670d\u52a1\u5668\u6559\u7a0b","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.wsisp.com\/helps\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"zh-Hans"},{"@type":"Person","@id":"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/358e386c577a3ab51c4493330a20ad41","name":"admin","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.wsisp.com\/helps\/#\/schema\/person\/image\/","url":"https:\/\/gravatar.wp-china-yes.net\/avatar\/?s=96&d=mystery","contentUrl":"https:\/\/gravatar.wp-china-yes.net\/avatar\/?s=96&d=mystery","caption":"admin"},"sameAs":["http:\/\/wp.wsisp.com"],"url":"https:\/\/www.wsisp.com\/helps\/author\/admin"}]}},"_links":{"self":[{"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/posts\/59369","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/comments?post=59369"}],"version-history":[{"count":0,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/posts\/59369\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/media\/59365"}],"wp:attachment":[{"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/media?parent=59369"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wsisp.com
\/helps\/wp-json\/wp\/v2\/categories?post=59369"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/tags?post=59369"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.wsisp.com\/helps\/wp-json\/wp\/v2\/topic?post=59369"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}