charnugagoo's repos on GitHub
ASP · 5 watchers
WebCrawler
A (very primitive) web crawler in Python that attempts to do a limited crawl of the web.
C · 2 watchers
DeepLearning
Deep Learning (Python, C/C++, Java)
Python · 2 watchers
DeepLearningTutorials
Deep Learning Tutorial notes and code. See the wiki for more info.
C++ · 2 watchers
InvertedIndex
Scheme · 1 watchers
data_structure
data_structure using lisp
Matlab · 1 watchers
deep_learning
Stanford deep learning tutorial
Processing · 1 watchers
inforvis_sleep_cycle
C++ · 1 watchers
LeetCode
My LeetCodeSolving
1 watchers
WebSearchEngine
Mini Web Search Engine with Crawling, Inverted Indexing and Query Processing
C · 0 watchers
algorithms
Algorithms & Data Structures in C++
Python · 0 watchers
algorithms-1
module of algorithms for Python
Python · 0 watchers
assignment1-basics
Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch
Ruby · 0 watchers
bugspots
Implementation of simple bug prediction hotspot heuristic
Matlab · 0 watchers
deeplearning-class-2011
Code for Deep Learning class at Google
Matlab · 0 watchers
DeepLearnToolbox
Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and vanilla Neural Nets. Each method has examples to get you started.
Java · 0 watchers
gephi
trunk
JavaScript · 0 watchers
hustoj
GitHub clone of SVN repo http://hustoj.googlecode.com/svn (cloned by http://svn2github.com/)
Processing · 0 watchers
infovis_sleep_cycle
Information Visualization Project Sleep Cycle
Python · 0 watchers
jieba
Jieba Chinese word segmentation
0 watchers
Math-Verify
Matlab · 0 watchers
MatlabNLP
Natural Language Processing tools for MATLAB
Python · 0 watchers
morb
Modular Restricted Boltzmann Machine (RBM) implementation using Theano
Java · 0 watchers
neo4j
Python · 0 watchers
nltk_book
NLTK Book
0 watchers
PersonalShare
Personal Stuff Share With Others
Python · 0 watchers
Projects
Trying to complete over 100 projects in various categories in Python. Fork to learn any new language.
Python · 0 watchers
pyalgotrade
Python Algorithmic Trading Library
Python · 0 watchers
pyner
Python interface to the Stanford Named Entity Recognizer
Python · 0 watchers
quantopian-algos
Library of algorithm scripts for Quantopian
C · 0 watchers
scikit-learn
scikit-learn: machine learning in Python
Python · 0 watchers
scikit-learn-tutorial
Applied Machine Learning in Python with scikit-learn
C++ · 0 watchers
shogun
The Shogun Machine Learning Toolbox (Source Code)
Python · 0 watchers
stanford-corenlp-python
Python wrapper for Stanford CoreNLP tools
Java · 0 watchers
storm
Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more
Python · 0 watchers
Theano
Optimizing GPU-meta-programming code generating array oriented optimizing math compiler in Python
C++ · 0 watchers
vowpal_wabbit
John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm
charnugagoo

🏢  NYU / Graduate Student
V2EX member #17829, joined on 2012-03-10 00:11:47 +08:00
CS Master@NYU
Parsing in NLP
ex ACM-ICPCer
ex Physicser
charnugagoo's recent replies
Jun 1, 2013
Replied to a topic by charnugagoo in Python: A convenient way to initialize a multidimensional list
@hhrmatata
I now need to initialize a five-dimensional list ToT, and both of these approaches are pretty ugly...
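For reference, one common way to build a deeply nested list without writing out five nested comprehensions is a small recursive helper. This is only a sketch: the two approaches discussed in the thread are not shown here, and the name `make_nested` and the example shape are assumptions made for illustration.

```python
def make_nested(shape, fill=0):
    """Return a nested list with the given shape, every cell set to `fill`."""
    if not shape:
        return fill
    head, *rest = shape
    # Build each sublist with its own call so that no two sublists are the
    # same object (the pitfall of writing [[0] * n] * m).
    return [make_nested(rest, fill) for _ in range(head)]


# A five-dimensional list; the shape is an arbitrary example.
grid = make_nested((2, 3, 4, 5, 6))
grid[0][0][0][0][0] = 1  # changes exactly one cell
```

Note that `fill` itself is reused for every cell, so this is safe for immutable values such as numbers or `None`, but a mutable `fill` would be shared across cells.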
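I think the atmosphere is a problem. Even the classmates around me who do open source are not very active, and when you want to get started you can't find an offline group to join.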
Apr 15, 2013
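Replied to a topic by glume in Q&A: Looking for a book on quantum mechanics
It depends on how deep the OP wants to go, and on the OP's math skills and English ability; different books suit different situations.
Does God Play Dice? (《上帝掷骰子吗》) is a fun and engaging popular-science read, well suited to building interest;
The Feynman Lectures on Physics require some physics background, brains, and time (they really are not easy to follow), but the payoff after finishing them is remarkable, and the math they demand is not especially deep.
Formal textbooks really test your math: at the shallow end you need partial differential equations and complex analysis, and at the deep end some functional analysis. You could first read a book on methods of mathematical physics (http://book.douban.com/subject/1182629/).
For Chinese textbooks, I used the one by Zhao Kaihua (http://book.douban.com/subject/1509468/), as well as a blue-covered one from Higher Education Press (I can't find it now, and I may be misremembering). I thought they were at least decent for getting started.
For English textbooks, the one senior students recommended back then was by David J. Griffiths (http://book.douban.com/subject/1706283/). Some blogs have also recommended Dirac's book and others.
If your English is good enough and you don't want to spend much time, the Coursera course mentioned in the reply above is quite good.

Also, the OP can search for book-recommendation blog posts by others. People who study math and physics love writing book reviews, just as CS people love writing blogs, and there are many such posts online.

Good luck~~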
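If it is very large, a Bloom filter is the usual choice; if it is even larger, or the crawler is distributed, you need something more sophisticated.
PS: You could also consider computing simhash at the same time to find duplicate pages.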
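As a rough illustration of the Bloom filter idea mentioned above, here is a minimal sketch for tracking which URLs a crawler has already seen. The class name, the default sizes, and the choice of SHA-256 with double hashing are all assumptions made for the example, not anything specified in the reply; a real crawler would size the filter from the expected number of URLs and the acceptable false-positive rate.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter for "have I seen this URL?" checks."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive the bit positions from two halves of one SHA-256 digest
        # (double hashing); any reasonable hash family would work here.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("http://example.com/a")
```

A crawler would check `url in seen` before fetching and call `seen.add(url)` afterwards. An occasional false positive only means a page is skipped, which is usually an acceptable trade for the memory saved over storing every URL in a set.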