NumPy, SciPy, pandas, scikit-learn, gensim

    xiaoxiao  2021-12-01

NumPy and SciPy

Matrix and vector processing. NumPy provides a high-performance multidimensional array and basic tools to compute with and manipulate these arrays. SciPy builds on this and provides a large number of functions that operate on NumPy arrays and are useful for different types of scientific and engineering applications.

References:
http://old.sebug.net/paper/books/scipydoc/numpy_intro.html
http://cs231n.github.io/python-numpy-tutorial/ (covers Python basics, NumPy, SciPy, and matplotlib)
Summary of NumPy usage: https://github.com/zhangweijiqn/testPython/blob/master/src/NumpyTest/testNumpy.py

Scikit-learn

Data modeling and analysis. scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

To update scikit-learn with conda: conda update scikit-learn

Official site: http://scikit-learn.org/stable/index.html
The documentation is quite detailed, and the home page lists many categories of machine learning tasks.
The user guide lists everything the package includes: http://scikit-learn.org/stable/user_guide.html

Installation: http://scikit-learn.org/stable/install.html
pip install -U scikit-learn (NumPy and SciPy must be installed first)

After installing this way, from sklearn.ensemble import RandomForestClassifier may fail with "ImportError: cannot import name check_arrays". For the cause, see http://stackoverflow.com/questions/29596237/import-check-arrays-from-sklearn. Fix: conda update scikit-learn

The sklearn model_selection module includes grid search (GridSearchCV).

API: http://scikit-learn.org/stable/modules/classes.html

sklearn provides a TF-IDF implementation that can be used to extract keywords from Chinese text and to vectorize it. A reference blog post: http://www.cnblogs.com/chenbjin/p/3851165.html

Pandas

Reading and writing data. A powerful Python data analysis toolkit.

Official home page: http://pandas.pydata.org/
Tutorial: http://pandas.pydata.org/pandas-docs/stable/10min.html
Documentation: http://pandas.pydata.org/pandas-docs/stable/index.html
Reading CSV files: http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.1/cookbook/Chapter 1 - Reading from a CSV.ipynb
Video tutorial: http://www.dataschool.io/easier-data-analysis-with-pandas/

gensim

Gensim is a specialized Python toolkit for topic modeling. Gensim is an open-source vector space modeling and topic modeling toolkit, implemented in the Python programming language. It uses NumPy, SciPy, and optionally Cython for performance. It is specifically intended for handling large text collections, using efficient online, incremental algorithms. Gensim is commercially supported by the startup RaRe Technologies. Gensim includes implementations of tf-idf, random projections, word2vec and document2vec (doc2vec) algorithms, hierarchical Dirichlet processes (HDP), latent semantic analysis (LSA), and latent Dirichlet allocation (LDA), including distributed parallel versions.

gensim: http://radimrehurek.com/gensim/index.html
github: https://github.com/RaRe-Technologies/gensim
install: pip install gensim
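
To make the NumPy and SciPy description above more concrete, here is a minimal sketch of array arithmetic in NumPy and a linear solve in SciPy. The matrix A and vector b are made-up illustration values, not taken from any of the linked references.

```python
import numpy as np
from scipy import linalg

# A small 2x2 matrix and a vector, stored as NumPy arrays
# (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Element-wise arithmetic and a matrix-vector product.
doubled = A * 2
product = A @ b  # equivalent to np.dot(A, b)

# SciPy operates on NumPy arrays: solve the linear system A x = b.
x = linalg.solve(A, b)

print(product)
print(x)
```

The point is only that SciPy functions take and return ordinary NumPy arrays, so the two libraries can be mixed freely.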
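
For the scikit-learn notes above (RandomForestClassifier, grid search, and TF-IDF), the following is a rough sketch rather than a recommended setup. The parameter grid and the three pre-segmented Chinese sentences are assumptions chosen for illustration; real Chinese text would first need a word segmenter, which is not shown here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV

# Part 1: grid search over a small, hypothetical parameter grid.
X, y = load_iris(return_X_y=True)
param_grid = {"n_estimators": [10, 50], "max_depth": [2, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)

# Part 2: TF-IDF on Chinese text that is already split into
# space-separated words (made-up example sentences).
docs = [
    "我 喜欢 机器 学习",
    "我 喜欢 数据 分析",
    "机器 学习 需要 数据",
]
# The default token pattern drops single-character tokens, so a pattern
# that keeps them is passed explicitly.
vectorizer = TfidfVectorizer(token_pattern=r"(?u)\b\w+\b")
tfidf_matrix = vectorizer.fit_transform(docs)
print(tfidf_matrix.shape)
print(sorted(vectorizer.vocabulary_))
```

Words with the highest TF-IDF weight in a row of tfidf_matrix can be treated as candidate keywords for that document.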
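
For the pandas section above, a minimal sketch of reading and writing CSV data. The CSV content and column names are invented for illustration, and an in-memory buffer stands in for a file path so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = """name,score
alice,90
bob,75
carol,82
"""

# Read the CSV into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
print(df["score"].mean())

# Write the DataFrame back out as CSV, without the index column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue())
```

With a real file, the same calls take a path instead, for example pd.read_csv("data.csv") and df.to_csv("out.csv", index=False), where the file names are placeholders.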
