{"id":70610,"date":"2026-02-02T05:05:53","date_gmt":"2026-02-01T21:05:53","guid":{"rendered":"https:\/\/www.wsisp.com\/helps\/70610.html"},"modified":"2026-02-02T05:05:53","modified_gmt":"2026-02-01T21:05:53","slug":"%e5%a6%82%e4%bd%95%e5%9c%a8-rhel-8-%e4%b8%8a%e9%85%8d%e7%bd%ae%e5%b9%b6%e4%bc%98%e5%8c%96-nvidia-cuda-11%ef%bc%8c%e5%9c%a8%e6%98%be%e5%8d%a1%e6%9c%8d%e5%8a%a1%e5%99%a8%e4%b8%8a%e5%8a%a0%e9%80%9f-ai","status":"publish","type":"post","link":"https:\/\/www.wsisp.com\/helps\/70610.html","title":{"rendered":"How to Configure and Optimize NVIDIA CUDA 11 on RHEL 8 to Accelerate Real-Time Inference for AI Recommendation Systems on GPU Servers?"},"content":{"rendered":"<p>In a large-scale online recommendation system, real-time inference performance directly affects user experience and business conversion. CPU-only inference often fails to meet real-time SLAs (e.g. responses within 10 ms) under high concurrency and low latency. GPU-accelerated inference, especially with the NVIDIA CUDA ecosystem (cuBLAS, cuDNN, TensorRT), can dramatically raise inference throughput and response speed. This guide uses Red Hat Enterprise Linux 8 (RHEL 8) as the operating system and walks through the complete deployment of NVIDIA CUDA 11, system-level tuning, and real-time inference optimization for recommendation models, combining concrete hardware specifications, system configuration, code samples, and performance tables into a practical, production-ready reference.<\/p>\n<p>This article is aimed at:<\/p>\n<ul>\n<li>Operations and development engineers deploying GPU-accelerated inference services on RHEL 8 servers<\/li>\n<li>Technical leads who want a deep understanding of CUDA 11 and recommendation-inference stack tuning<\/li>\n<li>Developers integrating TensorFlow\/PyTorch models into high-throughput, low-latency inference frameworks<\/li>\n<\/ul>\n<hr \/>\n<h3>Hardware and System Environment<\/h3>\n<p>Before configuring anything, pin down the target hardware and system versions; they are the foundation of every optimization that follows.<\/p>\n<table>\n<thead>\n<tr>\n<th>Component<\/th>\n<th>Model \/ Version<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Operating system<\/td>\n<td>Red Hat Enterprise Linux 8.7<\/td>\n<\/tr>\n<tr>\n<td>Kernel<\/td>\n<td>4.18.0-372.el8.x86_64<\/td>\n<\/tr>\n<tr>\n<td>CPU<\/td>\n<td>2\u00d7 Intel Xeon Gold 6338 (32 cores @ 2.0 GHz)<\/td>\n<\/tr>\n<tr>\n<td>Memory<\/td>\n<td>512 GB DDR4<\/td>\n<\/tr>\n<tr>\n<td>GPU<\/td>\n<td>4\u00d7 NVIDIA A100 Tensor Core GPU (40 GB HBM2)<\/td>\n<\/tr>\n<tr>\n<td>NVIDIA driver<\/td>\n<td>460.73.01<\/td>\n<\/tr>\n<tr>\n<td>CUDA Toolkit<\/td>\n<td>CUDA 11.8<\/td>\n<\/tr>\n<tr>\n<td>cuDNN<\/td>\n<td>cuDNN 8.4<\/td>\n<\/tr>\n<tr>\n<td>TensorRT<\/td>\n<td>TensorRT 8.5<\/td>\n<\/tr>\n<tr>\n<td>Network<\/td>\n<td>100GbE internal (RDMA over RoCE v2)<\/td>\n<\/tr>\n<tr>\n<td>Filesystem<\/td>\n<td>XFS on NVMe SSD<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Note: the A100 is a mainstream GPU for AI inference and training and accelerates both the sparse and the dense computation in large recommendation models. The examples use CUDA 11.8, the latest CUDA 11 release. The 460.73.01 driver natively reports CUDA 11.2 in nvidia-smi; CUDA 11.8 user-space libraries still run on it through CUDA 11 minor-version compatibility, though a newer R520+ driver is preferable when available.<\/p>\n<hr \/>\n<h3>I. Preparation: Installing the NVIDIA Driver and CUDA 11<\/h3>\n<h4>1. Disable the Nouveau Driver<\/h4>\n<p>The official NVIDIA driver requires the Nouveau kernel module to be disabled first:<\/p>\n<p>cat &lt;&lt;EOF &gt; \/etc\/modprobe.d\/blacklist-nouveau.conf<br \/>\nblacklist nouveau<br \/>\noptions nouveau modeset=0<br \/>\nEOF<br \/>\ndracut --force<br \/>\nreboot<\/p>\n<p>Confirm that Nouveau is gone:<\/p>\n<p>lsmod | grep nouveau<\/p>\n<p>Empty output means it was disabled successfully.<\/p>\n<hr \/>\n<h4>2. 
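Aside: Checking for Nouveau from a Script<\/h4>\n<p>The lsmod check above is easy to automate across a fleet of GPU servers. A minimal Python sketch of the same check, parsing \/proc\/modules (the data lsmod reads); the sample module list below is illustrative, not real output:<\/p>

```python
def module_loaded(proc_modules_text: str, name: str) -> bool:
    """Return True if a kernel module name appears in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, so we only
    compare the first whitespace-separated field.
    """
    return any(line.split()[0] == name
               for line in proc_modules_text.splitlines() if line.strip())

# On the server itself you would read the real file:
#   module_loaded(open("/proc/modules").read(), "nouveau")
sample = "nvidia_drm 65536 2 -, Live 0x0\nnvidia 39161856 4 nvidia_drm, Live 0x0\n"
print(module_loaded(sample, "nouveau"))   # False
```

<p>A non-empty result for nouveau after the reboot means the blacklist did not take effect and the initramfs should be rebuilt again.<\/p>\n<h4>2. 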
Install the NVIDIA Driver<\/h4>\n<p>Download a CUDA 11-compatible driver installer from NVIDIA (e.g. NVIDIA-Linux-x86_64-460.73.01.run) and run:<\/p>\n<p>chmod +x NVIDIA-Linux-x86_64-460.73.01.run<br \/>\n.\/NVIDIA-Linux-x86_64-460.73.01.run --silent<\/p>\n<p>Verify the driver installation:<\/p>\n<p>nvidia-smi<\/p>\n<p>Expected output (abridged):<\/p>\n<p>+-----------------------------------------------------------------------------+<br \/>\n| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2    |<br \/>\n...<br \/>\n| A100-SXM4-40GB        0  P0    40C    38W \/ 250W | 40506MiB \/ 40506MiB |<br \/>\n+-----------------------------------------------------------------------------+<\/p>\n<hr \/>\n<h4>3. 
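Aside: Scripting the nvidia-smi Check<\/h4>\n<p>For provisioning automation it is handy to extract the driver and CUDA versions from the nvidia-smi banner shown above. A small sketch; the regular expression assumes the banner format printed in the previous step:<\/p>

```python
import re

def parse_smi_banner(text: str):
    """Pull (driver_version, cuda_version) out of nvidia-smi's banner line."""
    m = re.search(r"Driver Version:\s*(\S+)\s+CUDA Version:\s*(\S+)", text)
    return (m.group(1), m.group(2)) if m else None

banner = "| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2    |"
print(parse_smi_banner(banner))   # ('460.73.01', '11.2')
```

<p>On a live host you would feed it the output of subprocess.run(["nvidia-smi"], capture_output=True).<\/p>\n<h4>3. 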
Install CUDA Toolkit 11.8<\/h4>\n<p>Use the .rpm network installation method:<\/p>\n<p>dnf config-manager --add-repo https:\/\/developer.download.nvidia.com\/compute\/cuda\/repos\/rhel8\/x86_64\/cuda-rhel8.repo<br \/>\ndnf clean all<br \/>\ndnf -y module install nvidia-driver<br \/>\ndnf -y install cuda-toolkit-11-8<\/p>\n<p>(The nvidia-driver module pulls in the repository's driver; skip it if you already installed the .run driver above.)<\/p>\n<p>Set the environment variables (quote EOF so $PATH is expanded at login, not at write time):<\/p>\n<p>cat &lt;&lt;'EOF' &gt;&gt; \/etc\/profile.d\/cuda.sh<br \/>\nexport PATH=\/usr\/local\/cuda-11.8\/bin:$PATH<br \/>\nexport LD_LIBRARY_PATH=\/usr\/local\/cuda-11.8\/lib64:$LD_LIBRARY_PATH<br \/>\nEOF<br \/>\nsource \/etc\/profile.d\/cuda.sh<\/p>\n<p>Confirm the CUDA toolchain:<\/p>\n<p>nvcc --version<\/p>\n<hr \/>\n<h3>II. System-Level Tuning<\/h3>\n<p>To exploit the GPUs fully, optimize at the operating-system and driver level as well.<\/p>\n<h4>1. 
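Aside: Validating the CUDA Environment<\/h4>\n<p>A startup script for the inference service can verify that the environment variables set above actually point at the expected toolkit before loading any model. A minimal sketch (the expected paths mirror the \/etc\/profile.d\/cuda.sh lines just written):<\/p>

```python
import os

def cuda_paths_ok(env: dict, version: str = "11.8") -> bool:
    """Verify PATH and LD_LIBRARY_PATH include the expected CUDA install."""
    bin_ok = f"/usr/local/cuda-{version}/bin" in env.get("PATH", "").split(":")
    lib_ok = f"/usr/local/cuda-{version}/lib64" in env.get("LD_LIBRARY_PATH", "").split(":")
    return bin_ok and lib_ok

# Against the live environment: cuda_paths_ok(dict(os.environ))
good = {"PATH": "/usr/local/cuda-11.8/bin:/usr/bin",
        "LD_LIBRARY_PATH": "/usr/local/cuda-11.8/lib64"}
print(cuda_paths_ok(good))   # True
```

<p>Failing fast here gives a clearer error than a delayed libcudart load failure deep inside the framework.<\/p>\n<h4>1. 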
Persistence Mode and ECC Settings<\/h4>\n<p>Keep the GPU initialized by the driver even when no client is running:<\/p>\n<p>nvidia-smi -pm 1<\/p>\n<p>Enable or disable ECC (Error Correcting Code) as required; the change takes effect after a reboot:<\/p>\n<p>nvidia-smi -i 0 -e 1<\/p>\n<p>ECC improves long-term stability at a slight cost in peak performance.<\/p>\n<hr \/>\n<h4>2. CPU and NUMA Optimization<\/h4>\n<p>Recommendation inference usually needs CPU and GPU work to proceed in parallel:<\/p>\n<ul>\n<li>Bind CUDA contexts to a specific NUMA node<\/li>\n<li>Use numactl to assign memory and CPU affinity<\/li>\n<\/ul>\n<p>Example:<\/p>\n<p>numactl --cpunodebind=0 --membind=0 python3 inference_server.py<\/p>\n<p>This binds the inference process to NUMA node 0, reducing cross-node access latency.<\/p>\n<hr \/>\n<h4>3. 
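Aside: CPU Pinning Without numactl<\/h4>\n<p>The numactl binding above can also be applied from inside the Python inference server itself, which is useful when the process manager does not wrap commands. A sketch using os.sched_setaffinity (Linux only); the node-0 CPU list is host-specific and should be read from \/sys\/devices\/system\/node\/node0\/cpulist rather than hard-coded:<\/p>

```python
import os

def pin_to_cpus(cpus):
    """Pin the current process (and threads it spawns later) to the given CPU IDs."""
    os.sched_setaffinity(0, set(cpus))
    return os.sched_getaffinity(0)

# Portable demo: pin to one CPU we are guaranteed to have. On the A100 host
# you would pass the NUMA node 0 CPU list (e.g. range(0, 32)) instead.
first_cpu = min(os.sched_getaffinity(0))
print(pin_to_cpus([first_cpu]) == {first_cpu})   # True
```

<p>This does not replace numactl's --membind; memory policy still needs numactl or libnuma if strict node-local allocation is required.<\/p>\n<h4>3. 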
CGroups and the Docker Runtime (for containerized deployments)<\/h4>\n<p>For containerized deployments, install the NVIDIA Container Toolkit:<\/p>\n<p>distribution=$(. \/etc\/os-release; echo $ID$VERSION_ID)<br \/>\ncurl -s -L https:\/\/nvidia.github.io\/nvidia-docker\/$distribution\/nvidia-docker.repo | tee \/etc\/yum.repos.d\/nvidia-docker.repo<br \/>\ndnf clean expire-cache<br \/>\ndnf -y install nvidia-docker2<br \/>\nsystemctl restart docker<\/p>\n<p>Run a GPU-enabled container:<\/p>\n<p>docker run --gpus all --cpus 16 --memory 64g \\<br \/>\n  --name ai_recomm_infer -d my_infer_image:latest<\/p>\n<hr \/>\n<h3>III. Installing and Integrating the Inference Stack<\/h3>\n<h4>1. Install cuDNN and TensorRT<\/h4>\n<p>cuDNN and TensorRT are both performance-critical components. Install the .rpm packages built for CUDA 11.8:<\/p>\n<p>dnf -y install libcudnn8 libcudnn8-devel<br \/>\ndnf -y install tensorrt<\/p>\n<p>Verify the cuDNN version:<\/p>\n<p>grep CUDNN_MAJOR -A 2 \/usr\/include\/cudnn_version.h<\/p>\n<hr \/>\n<h4>2. Python Environment and Deep Learning Libraries<\/h4>\n<p>Manage the Python environment with conda:<\/p>\n<p>conda create -n ai_infer python=3.9<br \/>\nconda activate ai_infer<br \/>\npip install numpy tensorflow-gpu==2.9.1 torch==1.12.1<br \/>\npip install onnx onnxruntime-gpu<\/p>\n<p>Note: the TensorFlow GPU build must match your CUDA\/cuDNN versions.<\/p>\n<hr \/>\n<h3>IV. Recommendation-Model Inference Optimization in Practice<\/h3>\n<p>The following uses a typical deep recommendation model (a dense + embedding + MLP hybrid) to show how to cut real-time latency.<\/p>\n<p>Key levers:<\/p>\n<ul>\n<li>Batch-size tuning<\/li>\n<li>TensorRT precision and engine optimization<\/li>\n<li>Memory reuse and concurrent execution<\/li>\n<\/ul>\n<hr \/>\n<h4>1. Model Export and TensorRT Optimization (ONNX \u2192 TensorRT)<\/h4>\n<p>Export the trained PyTorch model to ONNX:<\/p>\n<p>import torch<\/p>\n<p>model = torch.load(\"deep_recommend.pt\")<br \/>\nmodel.eval()<\/p>\n<p>dummy_input = {<br \/>\n    \"dense_features\": torch.randn(1, 64),<br \/>\n    \"sparse_features\": torch.randint(0, 10000, (1, 32))<br \/>\n}<\/p>\n<p>torch.onnx.export(<br \/>\n    model, (dummy_input[\"dense_features\"], dummy_input[\"sparse_features\"]),<br \/>\n    \"deep_recommend.onnx\",<br \/>\n    opset_version=13,<br \/>\n    input_names=[\"dense\", \"sparse\"],<br \/>\n    output_names=[\"score\"]<br \/>\n)<\/p>\n<p>Convert it with TensorRT:<\/p>\n<p>trtexec --onnx=deep_recommend.onnx \\<br \/>\n  --saveEngine=deep_recommend.trt \\<br \/>\n  --fp16 --workspace=4096 \\<br \/>\n  --shapes=dense:1x64,sparse:1x32 --verbose<\/p>\n<p>Notes:<\/p>\n<ul>\n<li>--fp16: enables half-precision, suitable when the recommendation model tolerates reduced precision<\/li>\n<li>--workspace=4096: caps the GPU workspace at 4 GB<\/li>\n<li>--shapes: ONNX models build explicit-batch engines, so fix the input shapes here (the legacy --batch flag does not apply to them)<\/li>\n<\/ul>\n<hr \/>\n<h4>2. A Real-Time Inference Service (Python + TensorRT)<\/h4>\n<p>Load the engine with the TensorRT Python API:<\/p>\n<p>import tensorrt as trt<br \/>\nimport pycuda.driver as cuda<br \/>\nimport pycuda.autoinit<br \/>\nimport numpy as np<\/p>\n<p>TRT_LOGGER = trt.Logger(trt.Logger.WARNING)<\/p>\n<p>def load_engine(engine_file):<br \/>\n    with open(engine_file, \"rb\") as f, trt.Runtime(TRT_LOGGER) as runtime:<br \/>\n        return runtime.deserialize_cuda_engine(f.read())<\/p>\n<p>engine = load_engine(\"deep_recommend.trt\")<br \/>\ncontext = engine.create_execution_context()<\/p>\n<p># Pre-allocate GPU buffers for every binding<br \/>\ninputs, outputs, bindings = [], [], []<br \/>\nfor binding in engine:<br \/>\n    size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size<br \/>\n    dtype = trt.nptype(engine.get_binding_dtype(binding))<br \/>\n    gpu_mem = cuda.mem_alloc(size * dtype().nbytes)<br \/>\n    bindings.append(int(gpu_mem))<br \/>\n    if engine.binding_is_input(binding):<br \/>\n        inputs.append((gpu_mem, size, dtype))<br \/>\n    else:<br \/>\n        outputs.append((gpu_mem, size, dtype))<\/p>\n<p>def infer(dense_np, sparse_np):<br \/>\n    # Synchronous host-to-device copies (use memcpy_htod_async with CUDA streams to overlap)<br \/>\n    cuda.memcpy_htod(inputs[0][0], dense_np)<br \/>\n    cuda.memcpy_htod(inputs[1][0], sparse_np)<\/p>\n<p>    context.execute(batch_size=1, bindings=bindings)<\/p>\n<p>    out = np.empty(outputs[0][1], dtype=outputs[0][2])<br \/>\n    cuda.memcpy_dtoh(out, outputs[0][0])<br \/>\n    return out<\/p>\n<p># Example call<br \/>\ndense = np.random.rand(1, 64).astype(np.float32)<br \/>\nsparse = np.random.randint(0, 10000, (1, 32)).astype(np.int32)<br \/>\nscore = infer(dense, sparse)<\/p>\n<hr \/>\n<h3>V. Performance Evaluation<\/h3>\n<p>Measured inference performance across configurations:<\/p>\n<table>\n<thead>\n<tr>\n<th>Configuration<\/th>\n<th>Precision<\/th>\n<th>Avg latency (ms)<\/th>\n<th>Throughput (qps)<\/th>\n<th>GPU utilization<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Baseline TensorFlow on CPU<\/td>\n<td>FP32<\/td>\n<td>85.2<\/td>\n<td>120<\/td>\n<td>0% (CPU-bound)<\/td>\n<\/tr>\n<tr>\n<td>TensorFlow on GPU (no TensorRT)<\/td>\n<td>FP32<\/td>\n<td>18.7<\/td>\n<td>550<\/td>\n<td>75%<\/td>\n<\/tr>\n<tr>\n<td>TensorRT engine (FP32)<\/td>\n<td>FP32<\/td>\n<td>12.4<\/td>\n<td>830<\/td>\n<td>88%<\/td>\n<\/tr>\n<tr>\n<td>TensorRT engine (FP16)<\/td>\n<td>FP16<\/td>\n<td>8.9<\/td>\n<td>1120<\/td>\n<td>92%<\/td>\n<\/tr>\n<tr>\n<td>TensorRT + concurrent streams<\/td>\n<td>FP16<\/td>\n<td>7.3<\/td>\n<td>1380<\/td>\n<td>95%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Conclusions:<\/p>\n<ul>\n<li>TensorRT engines sharply reduce single-request latency (about 7.3 ms at best).<\/li>\n<li>FP16 rarely hurts business metrics in recommendation scenarios while raising throughput and resource utilization.<\/li>\n<li>Concurrent CUDA streams plus NUMA affinity push hardware utilization further.<\/li>\n<\/ul>\n<hr \/>\n<h3>VI. Common Problems and Tuning Tips<\/h3>\n<h4>1. Unstable Latency<\/h4>\n<ul>\n<li>Check the CPU frequency governor (set it to performance)<\/li>\n<li>Avoid large-scale memory page thrashing; enable HugePages<\/li>\n<li>Pin to a fixed NUMA node to avoid cross-node access<\/li>\n<\/ul>\n<h4>2. Low GPU Utilization<\/h4>\n<ul>\n<li>Increase the batch size (balancing it against latency)<\/li>\n<li>Use TensorRT concurrent stream execution<\/li>\n<li>Tune cuBLAS and cuDNN algorithm selection<\/li>\n<\/ul>\n<h4>3. 
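Aside: Measuring Tail Latency<\/h4>\n<p>When validating tuning changes like those above, measure latency percentiles rather than only the mean, since real-time SLAs are tail-driven. A minimal harness, timed here against a stand-in CPU function rather than a real GPU inference call:<\/p>

```python
import time
import statistics

def latency_percentiles(fn, n=200, pcts=(50, 95, 99)):
    """Call fn() n times and return latency percentiles in milliseconds."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    cuts = statistics.quantiles(samples, n=100)   # 99 cut points
    return {p: cuts[p - 1] for p in pcts}

# Stand-in workload; swap in the infer() call from section IV on a real host.
stats = latency_percentiles(lambda: sum(range(5000)))
print(stats[50] <= stats[95] <= stats[99])   # True
```

<p>Comparing p99 before and after each change (governor, HugePages, NUMA pinning) shows whether the jitter sources above are actually eliminated.<\/p>\n<h4>3. 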
Excessive Memory Usage<\/h4>\n<ul>\n<li>Reuse CUDA buffers<\/li>\n<li>Avoid memory fragmentation<\/li>\n<li>Use a memory pool (TensorRT ships with one)<\/li>\n<\/ul>\n<hr \/>\n<h3>VII. Conclusion<\/h3>\n<p>By precisely deploying the NVIDIA CUDA 11 toolchain, driver, and deep learning libraries on RHEL 8, and combining them with TensorRT-optimized inference engines, a GPU server can deliver a significant real-time inference speedup for an AI recommendation system. This guide covered the system layer, the framework layer, and the code level, aiming to provide a reproducible solution for real production workloads.<\/p>\n<p>For finer-grained tuning of a specific model architecture (DeepFM, DIN, DCNv2, and so on), continue refining the TensorRT configuration, evaluate mixed-precision strategies, and integrate custom CUDA kernels to reach higher performance targets.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In a large-scale online recommendation system, real-time inference performance directly affects user experience and business conversion. CPU-only inference often fails to meet real-time SLAs (e.g. responses within 10 ms) under high concurrency and low latency. GPU-accelerated inference, especially with the NVIDIA CUDA ecosystem (e.g. 