Crawler software
Crawler software is commonly used to collect large amounts of information from the web; a crawler that exploits vulnerabilities to harvest data is called a malicious crawler. A web crawler is a program that automatically fetches web pages, downloading them from the World Wide Web on behalf of a search engine, of which it is a core component.

A traditional crawler starts from the URLs of one or more seed pages and extracts the URLs found on them. As it crawls, it keeps extracting new URLs from the current page and adding them to a queue until a stop condition defined by the system is met.

The workflow of a focused crawler is more complex. Using a web-page analysis algorithm, it filters out links that are irrelevant to its topic, keeps the useful ones, and places them in the URL queue of pages waiting to be fetched. It then selects the next URL from the queue according to a search strategy and repeats the process until a system-defined condition is reached. All pages fetched by the crawler are stored, analyzed, filtered, and indexed by the system for later querying and retrieval; for a focused crawler, the results of this analysis can also feed back into and guide subsequent crawling. A minimal sketch of this queue-based workflow is shown below.
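The sketch below illustrates the queue-based workflow described above, using only the Python standard library. The seed URL, the page limit, and the optional is_relevant predicate (standing in for the focused crawler's page-analysis and link-filtering step) are illustrative assumptions, not details taken from any particular crawler implementation.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags on a fetched page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=50, is_relevant=None):
    """Breadth-first crawl over a URL queue.

    A URL is taken from the queue, the page is fetched and stored, its
    links are extracted, and unseen URLs are enqueued. The crawl stops
    once max_pages pages have been fetched (the stop condition).
    If is_relevant is given, only links that pass it are enqueued,
    which is the filtering step a focused crawler performs.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}

    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            with urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip pages that cannot be fetched

        pages[url] = html  # store the page for later analysis and indexing

        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if not absolute.startswith("http") or absolute in seen:
                continue
            if is_relevant is not None and not is_relevant(absolute):
                continue  # focused crawler: drop links judged off-topic
            seen.add(absolute)
            queue.append(absolute)

    return pages


if __name__ == "__main__":
    # Hypothetical seed URL and keyword filter, used purely for illustration.
    result = crawl(
        ["https://example.com/"],
        max_pages=5,
        is_relevant=lambda u: "crawler" in u or "example.com" in u,
    )
    print(f"Fetched {len(result)} pages")
```

A real crawler would add politeness controls (robots.txt handling, rate limiting) and a more capable page-analysis step, but the queue of pending URLs, the seen-set, and the stop condition correspond directly to the workflow the paragraph describes.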