如何使我的AJAX内容能够被谷歌抓取?
I've been working on a site that uses jQuery heavily and loads in content via AJAX like so:
我一直在一个使用jQuery的网站上工作,通过AJAX加载内容,比如:
$('#newPageWrapper').load(newPath + ' .pageWrapper', function() {
//on load logic
}
It has now come to my attention that Google won't index any dynamically loaded content via Javascript and so I've been looking for a solution to the problem.
现在我注意到谷歌不会通过Javascript索引任何动态加载的内容,所以我一直在寻找这个问题的解决方案。
I've read through Google's Making AJAX Applications Crawlable document what seems like 100 times and I still don't understand how to implement it (due in the most part to my limited knowledge of servers).
我已经阅读了谷歌的AJAX应用程序爬虫文档100次了,但我仍然不知道如何实现它(主要是因为我对服务器的了解有限)。
So my first question would be:
我的第一个问题是
- Is there a decent step-by-step tutorial out there that documents this from start to finish that you know of? I've tried to Google it and I'm not finding anything useful.
- 是否有一个好的一步一步的教程来记录你所知道的从开始到结束的过程?我试过了,但没有发现任何有用的东西。
And secondly, if there isn't anything out there yet, would anyone be able to explain:
第二,如果还没有任何东西,有人能解释吗?
How to 'Set up my server to handle requests for URLs that contain _escaped_fragment_'
如何“设置我的服务器来处理包含_escaped_fragment_的url请求”
How to implement HtmlUnit on my server to create an 'HTML snapshot' of the page to show to the crawler.
如何在我的服务器上实现HtmlUnit来创建页面的“HTML快照”以显示给爬虫。
I would be incredibly grateful if someone could shed some light on this for me, thanks in advance!
如果有人能给我解释一下,我会非常感激的,提前谢谢!
-Ben
本
3 个解决方案
#1
2
The best solution is to make a site that works with and without JavaScript. Read articles on Progressive enhancement.
最好的解决方案是创建一个不使用JavaScript的站点。阅读关于渐进增强的文章。
更多相关文章
- 【问题解决方案】ImportError: No module named 'pygal'
- Python 黏包及黏包解决方案
- Python网页静态爬虫
- python学习笔记(3)--爬虫基础教程1
- Python爬虫二(Urllib库的基本使用和高级用法)
- 安装Python及爬虫入门介绍
- 【python网络爬虫三】爬取动态数据及数据入库
- python爬虫爬取wallpapers最新壁纸
- python爬虫:爬取豌豆荚APP第一页数据信息(selenium)