{"id":9812,"date":"2024-12-17T18:38:58","date_gmt":"2024-12-17T10:38:58","guid":{"rendered":"http:\/\/www.yliyun.com\/?p=9812"},"modified":"2024-12-17T18:38:58","modified_gmt":"2024-12-17T10:38:58","slug":"python-%e8%af%ad%e8%a8%80%e6%a3%80%e6%b5%8b","status":"publish","type":"post","link":"http:\/\/www.yliyun.com\/2024\/12\/17\/python-%e8%af%ad%e8%a8%80%e6%a3%80%e6%b5%8b\/","title":{"rendered":"Python Language Detection"},"content":{"rendered":"\n
Python offers several solid language identification tools. The following libraries are commonly used:<\/p>\n\n\n\n
1. langdetect<\/strong><\/p>\n\n\n\n \u2022 Overview<\/strong>: langdetect is a widely used language detection library, a Python port of the language-detection project originally hosted on Google Code. It supports 55 languages out of the box. Detection is probabilistic, so results for very short or ambiguous text can vary between runs unless a seed is set through DetectorFactory.seed.<\/p>\n\n\n\n \u2022 Installation<\/strong>:<\/p>\n\n\n\n pip install langdetect<\/p>\n\n\n\n \u2022 Usage example<\/strong>:<\/p>\n\n\n\n from langdetect import detect<\/p>\n\n\n\n text = 'Bonjour tout le monde'<\/p>\n\n\n\n language = detect(text)<\/p>\n\n\n\n print(language)  # Expected output: 'fr' (French)<\/em><\/p>\n\n\n\n 2. langid<\/strong><\/p>\n\n\n\n \u2022 Overview<\/strong>: langid is another capable language identification library. It ships with a pre-trained model covering 97 languages and is self-contained apart from its one external dependency, NumPy.<\/p>\n\n\n\n \u2022 Installation<\/strong>:<\/p>\n\n\n\n pip install langid<\/p>\n\n\n\n \u2022 Usage example<\/strong>:<\/p>\n\n\n\n import langid<\/p>\n\n\n\n text = 'Hola, \u00bfc\u00f3mo est\u00e1s?'<\/p>\n\n\n\n language, _ = langid.classify(text)  # classify() returns (language code, score)<\/p>\n\n\n\n print(language)  # Expected output: 'es' (Spanish)<\/em><\/p>\n\n\n\n 3. 
polyglot<\/strong><\/p>\n\n\n\n \u2022 Overview<\/strong>: polyglot is a multilingual natural language processing library. Beyond language identification, it supports tasks such as sentiment analysis and named entity recognition. Its language detector relies on the pycld2 and PyICU packages, which must be installed alongside it.<\/p>\n\n\n\n \u2022 Installation<\/strong>:<\/p>\n\n\n\n pip install polyglot pyicu pycld2<\/p>\n\n\n\n \u2022 Usage example<\/strong>:<\/p>\n\n\n\n from polyglot.detect import Detector<\/p>\n\n\n\n text = 'Ceci est un exemple de texte en fran\u00e7ais'<\/p>\n\n\n\n detector = Detector(text)<\/p>\n\n\n\n language = detector.language.code<\/p>\n\n\n\n print(language)  # Expected output: 'fr' (French)<\/em><\/p>\n\n\n\n 4. TextBlob<\/strong><\/p>\n\n\n\n \u2022 Overview<\/strong>: TextBlob is a simple, easy-to-use natural language processing toolkit aimed mainly at tasks such as sentiment analysis and part-of-speech tagging. Older releases also offered language identification through detect_language(), which called the Google Translate web API and therefore needed network access. The method was deprecated in 0.16.0 and removed in 0.18.0, so the example below works only with earlier versions.<\/p>\n\n\n\n \u2022 Installation<\/strong>:<\/p>\n\n\n\n pip install 'textblob<0.18'<\/p>\n\n\n\n \u2022 Usage example<\/strong>:<\/p>\n\n\n\n from textblob import TextBlob<\/p>\n\n\n\n text = 'Hello, how are you?'<\/p>\n\n\n\n blob = TextBlob(text)<\/p>\n\n\n\n print(blob.detect_language())  # Expected output: 'en' (English)<\/em><\/p>\n\n\n\n
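All four libraries take a string and return a language code, and all of them become unreliable on very short or feature-less input. Some raise an exception on it. A thin wrapper around whichever detector you choose is one way to handle this. The sketch below is illustrative only: safe_detect, fake_detector, the 20-character threshold and the 'und' (undetermined) fallback code are assumptions made for this example, not part of any of the libraries above.

```python
from typing import Callable


def safe_detect(text: str,
                detect_fn: Callable[[str], str],
                min_chars: int = 20,
                default: str = 'und') -> str:
    '''Return a language code, or `default` when detection is not trustworthy.

    `detect_fn` is any callable mapping text to a language code, for example
    `langdetect.detect` or `lambda t: langid.classify(t)[0]`.
    '''
    cleaned = text.strip()
    # Very short inputs give detectors too little evidence to work with.
    # The threshold is an arbitrary assumption; tune it for your data.
    if len(cleaned) < min_chars:
        return default
    try:
        return detect_fn(cleaned)
    except Exception:
        # Some detectors raise on input they cannot analyse (e.g. digits only).
        return default


# Hypothetical stand-in detector so the sketch runs without third-party packages.
def fake_detector(text: str) -> str:
    if not any(ch.isalpha() for ch in text):
        raise ValueError('no usable features in text')
    return 'fr'


print(safe_detect('Bonjour tout le monde, comment allez-vous ?', fake_detector))  # fr
print(safe_detect('Salut', fake_detector))                                        # und
print(safe_detect('1234567890 1234567890 12345', fake_detector))                  # und
```

To use it with a real library, pass that library's detection function in place of fake_detector, for instance safe_detect(text, detect) after from langdetect import detect.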