Scraping Web Page Data on Android with jsoup (Part 2)
16lz
2021-01-26
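The screenshots are too large to embed, so I put them on GitHub. To see the result, follow these links:
Screenshot 1
Screenshot 2
The screenshots are the same as in the previous post. That post only covered the simple case: the page was essentially a flat list with a simple data structure. This one is a bit more involved. Here is the relevant HTML, trimmed to the useful parts.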
<div class="recipeCategory_sub_R clear">
  <ul>
    <li><span class="category_s1"><b>猪蹄儿</b></span> <span class="category_s2">两个</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/LiaoJiu/" title="料酒的做法"><b>料酒</b></a></span> <span class="category_s2">一勺半</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/SHENGJIANG/" title="生姜的做法"><b>生姜</b></a></span> <span class="category_s2">6g</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/DaSuan/" title="大蒜的做法"><b>大蒜</b></a></span> <span class="category_s2">4g</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/GuiPi/" title="桂皮的做法"><b>桂皮</b></a></span> <span class="category_s2">一块儿</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/XiangYe/" title="香叶的做法"><b>香叶</b></a></span> <span class="category_s2">三四片儿</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/BaJiao/" title="八角的做法"><b>八角</b></a></span> <span class="category_s2">两三个</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/DingXiang/" title="丁香的做法"><b>丁香</b></a></span> <span class="category_s2">1g</span></li>
    <li><span class="category_s1"><a target="_blank" href="http://www.meishichina.com/YuanLiao/DaCong/" title="大葱的做法"><b>大葱</b></a></span> <span class="category_s2">一根</span></li>
    <li><span class="category_s1"><b>小米辣</b></span> <span class="category_s2">三四个</span></li>
  </ul>
</div>
<div class="recipeCategory_sub_R mt30 clear">
  <ul>
    <li><span class="category_s1"><a title="咸鲜" href="http://home.meishichina.com/recipe-type-do-cuisine-view-8.html" target="_blank">咸鲜</a></span> <span class="category_s2">口味</span></li>
    <li><span class="category_s1"><a title="煮" href="http://home.meishichina.com/recipe-type-do-technics-view-7.html" target="_blank">煮</a></span> <span class="category_s2">工艺</span></li>
    <li><span class="category_s1"><a title="一小时" href="http://home.meishichina.com/recipe-type-do-during-view-5.html" target="_blank">一小时</a></span> <span class="category_s2">耗时</span></li>
    <li><span class="category_s1"><a title="普通" href="http://home.meishichina.com/recipe-type-do-level-view-2.html" target="_blank">普通</a></span> <span class="category_s2">难度</span></li>
  </ul>
</div>
<div class="mo mt20">
  <h3>猪蹄冻----高逼格新年宴客菜的做法步骤</h3>
</div>
<div class="recipeStep">
  <ul>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109004407482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:1"></div> <div class="recipeStep_word"><div class="recipeStep_num">1</div>准备两个猪蹄儿,上次做的猪蹄儿用的是肘子,今天用蹄儿记住对半切开成两半就可以,为啥捏不然剁碎了挑骨头渣子累死你,因为有碎骨头一切就碎了不好看明白了吧</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109099457482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:2"></div> <div class="recipeStep_word"><div class="recipeStep_num">2</div>锅里上入适量水,放入猪蹄儿,一勺半料酒,三片生姜,煮开后后撇去浮沫再用温水洗净</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109139157482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:3"></div> <div class="recipeStep_word"><div class="recipeStep_num">3</div>洗净后检查下猪蹄上面是否有毛毛,我是用刮眉毛的刀刀一下子就消灭的干干净净滴,哈哈</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109165297482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:4"></div> <div class="recipeStep_word"><div class="recipeStep_num">4</div>来准备一些配料也不是绝对的: 桂皮,香叶,八角,大葱,丁香。。。。这些材料最好是用一个纱布包起来,我家用完了,就木有用</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109206797482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:5"></div> <div class="recipeStep_word"><div class="recipeStep_num">5</div>我一直觉得电压力锅煮方便又快当然高压力锅也是可以的,电压力锅不费水,超过一点点就可以加入少许香料,和大葱段,生姜片,少许食盐</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109245617482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:6"></div> <div class="recipeStep_word"><div class="recipeStep_num">6</div>我大概煮了一个来小时高压锅上汽后大概半小时后左右就可以了</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109358907482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:7"></div> <div class="recipeStep_word"><div class="recipeStep_num">7</div>舀去上面的浮沫</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109399577482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:8"></div> <div class="recipeStep_word"><div class="recipeStep_num">8</div>把猪蹄捞起来凉凉,在用豆浆的漏勺舀取所有的香料和碎沫渣子所以用细一点的漏勺会弄得比较彻底</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109429647482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:9"></div> <div class="recipeStep_word"><div class="recipeStep_num">9</div>汤稍凉后舀去最上面的油脂</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109457347482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:10"></div> <div class="recipeStep_word"><div class="recipeStep_num">10</div>稍凉后用手带上一次性手套脱去骨头,这是一个细致的活儿,一个个慢慢来全部脱掉后再仔细的确认有木有碎骨头小渣子之类的最后用刀切成小段</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109709027482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:11"></div> <div class="recipeStep_word"><div class="recipeStep_num">11</div>加入刚刚的汤里,,如果还是怕里面有骨头渣子再用漏勺捞起来用筷子在仔细挑挑,免得切的时候碎掉就欲哭无泪啦这个时候转到小汤锅里中小火慢慢煮</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109607987482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:12"></div> <div class="recipeStep_word"><div class="recipeStep_num">12</div>直到煮到三分之一的时候就可以熄火啦,记住哦剩下的汤多一点就会软汤少一点就会硬一些</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109644707482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:13"></div> <div class="recipeStep_word"><div class="recipeStep_num">13</div>稍凉后放入一个你觉得切起来形状会好看的容器里,没有热气后盖上密封的盖子放入冰箱冷藏室【不是冷冻室哦】</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816109969587482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:14"></div> <div class="recipeStep_word"><div class="recipeStep_num">14</div>第二天就可以开吃啦,很美有木有,超级的满足感啊</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816110026397482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:15"></div> <div class="recipeStep_word"><div class="recipeStep_num">15</div>用刀切成均匀的片儿就可以啦</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816110079187482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:16"></div> <div class="recipeStep_word"><div class="recipeStep_num">16</div>一片片的摆好就可以上桌了,清爽不腻,对于女人懂得,满满的胶原蛋白</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816110125747482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:17"></div> <div class="recipeStep_word"><div class="recipeStep_num">17</div>人生也像坐火车一样,过去的景色那样美,让你流连不舍,可是你总是需要前进,会离开, 然后你告诉自己,没关系,我以后一定还会再来看,可其实,往往你再也不会回去。流逝的 时间,退后的风景,邂逅的人,终究是渐行渐远。</div></li>
    <li><div class="recipeStep_img"><img src="http://i3.meishichina.com/attachment/recipe/2016/12/13/2016121314816110199097482619.jpg@!p320" alt="猪蹄冻----高逼格新年宴客菜的做法步骤:18"></div> <div class="recipeStep_word"><div class="recipeStep_num">18</div>我就知道你们会问这个蘸料的做法:少许生姜末大蒜末,葱末,少许辣椒油,香醋,生抽,蒸鱼鼓油,香芝麻 ,少许鸡粉,小米辣碎,搅匀就可以啦</div></li>
  </ul>
</div>
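Note that the div elements here are not root-level divs; they are nested several layers deep. In the previous post the target was a root-level div, so extracting the data was straightforward. Here it takes a little more work.
Core code: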
try {
    Document document = Jsoup.connect("http://home.meishichina.com/recipe-304301.html").get();

    // Dish name
    Elements recipeDeImgBox = document.getElementsByClass("recipe_De_imgBox");
    String title = recipeDeImgBox.select("a").attr("title");
    // Log.d("jsoup", title);

    // Large header image
    String imgUrl = recipeDeImgBox.select("a").select("img").attr("src");
    // Log.d("jsoup", imgUrl);

    // Ingredient section heading
    Elements fodderElement = document.getElementsByClass("mo mt20");
    String fodderTitle = fodderElement.get(0).select("h3").text();
    // Log.d("jsoup", fodderTitle);

    // Individual ingredients
    Elements fodderItemElement = document.getElementsByClass("recipeCategory_sub_R clear");
    int fodderSize = fodderItemElement.select(".category_s1").size();
    for (int i = 0; i < fodderSize; i++) {
        // Ingredient name
        String fodderName = fodderItemElement.select(".category_s1").get(i).select("b").text();
        // Log.d("jsoup", fodderName);
        // Ingredient amount
        String fodderMany = fodderItemElement.select(".category_s2").get(i).text();
        // Log.d("jsoup", fodderMany);
    }

    // Recipe attributes (flavor, technique, time, difficulty)
    Elements fodderArtElement = document.getElementsByClass("recipeCategory_sub_R mt30 clear");
    int fodderArtSize = fodderArtElement.select(".category_s1").size();
    for (int i = 0; i < fodderArtSize; i++) {
        // Read from fodderArtElement here, not fodderItemElement (the ingredient block)
        String fodderArtTitle = fodderArtElement.select(".category_s1").get(i).select("a").attr("title");
        // Log.d("jsoup", fodderArtTitle);
        String fodderArtDetail = fodderArtElement.select(".category_s2").get(i).text();
        // Log.d("jsoup", fodderArtDetail);
    }

    // Step-by-step instructions
    Elements recipeStepElement = document.getElementsByClass("recipeStep_word");
    Elements recipeStepImgElement = document.getElementsByClass("recipeStep_img");
    int recipeStepSize = recipeStepElement.size();
    for (int i = 0; i < recipeStepSize; i++) {
        // Step image
        String recipeImg = recipeStepImgElement.get(i).select("img").attr("src");
        // Log.d("jsoup", recipeImg);
        // Step description
        String recipeStep = recipeStepElement.get(i).text();
        // Log.d("jsoup", recipeStep);
    }
} catch (Exception e) {
    e.printStackTrace();
}
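Extracting the data follows a simple rule: each key has a corresponding value, the familiar key-value pair. Where there is only a value and no key, I just call text() on the element.
Everything else is explained in detail in the code comments.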
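To make the key-value idea concrete, here is a minimal sketch in plain Java (no jsoup) that pairs two parallel lists, such as the ingredient names and amounts, into an ordered map. The class `PairingSketch` and the method `pair` are hypothetical names used only for illustration; they are not part of the original project. Pairing only up to the shorter list is an assumption, chosen so that a missing span on the page cannot cause an IndexOutOfBoundsException.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original project.
class PairingSketch {

    // Pairs keys.get(i) with values.get(i), preserving the order on the page.
    // Stops at the shorter list (an assumption about the desired behavior).
    // Note: a repeated key overwrites the earlier value.
    static Map<String, String> pair(List<String> keys, List<String> values) {
        Map<String, String> result = new LinkedHashMap<>();
        int n = Math.min(keys.size(), values.size());
        for (int i = 0; i < n; i++) {
            result.put(keys.get(i), values.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample values taken from the HTML excerpt above.
        List<String> names = Arrays.asList("料酒", "生姜", "大蒜");
        List<String> amounts = Arrays.asList("一勺半", "6g", "4g");
        System.out.println(pair(names, amounts));
    }
}
```

In the scraper, the two lists would come from the text of the `.category_s1` and `.category_s2` elements respectively.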
Activity:
package com.fanyafeng.recreation.activity;

import android.os.Bundle;
import android.os.Handler;
import android.os.Message;
import android.support.design.widget.FloatingActionButton;
import android.support.design.widget.Snackbar;
import android.support.v7.app.AppCompatActivity;
import android.support.v7.widget.GridLayoutManager;
import android.support.v7.widget.RecyclerView;
import android.support.v7.widget.Toolbar;
import android.util.Log;
import android.view.LayoutInflater;
import android.view.View;
import android.widget.LinearLayout;
import android.widget.TextView;

import com.facebook.drawee.view.SimpleDraweeView;
import com.fanyafeng.recreation.R;
import com.fanyafeng.recreation.BaseActivity;
import com.fanyafeng.recreation.adapter.MenuDetailAdapter;
import com.fanyafeng.recreation.bean.FodderArtBean;
import com.fanyafeng.recreation.bean.FodderBean;
import com.fanyafeng.recreation.bean.MenuDetailBean;
import com.fanyafeng.recreation.fragment.ThreeFragment;
import com.fanyafeng.recreation.util.ControllerListenerUtil;
import com.fanyafeng.recreation.util.DpPxConvert;
import com.fanyafeng.recreation.util.MyUtils;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Must be used together with BaseActivity, which is assumed to live in the root package
public class MenuDetailActivity extends BaseActivity {
    private String url;
    private String myTitle;

    private RecyclerView rvMenuDetail;
    private List<MenuDetailBean> menuDetailBeanList = new ArrayList<>();
    private MenuDetailAdapter menuDetailAdapter;

    private View headerView;
    private SimpleDraweeView sdvMenuDetailHeader;
    private LinearLayout layoutFodder;
    private LinearLayout layoutFodderArt;
    private TextView tvRVTitle;

    private List<FodderBean> fodderBeanList = new ArrayList<>();
    private List<FodderArtBean> fodderArtBeanList = new ArrayList<>();

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_menu_detail);
        // Uses the toolbar's top-left title by default; for a centered title,
        // comment this out and use the commented-out code below instead
        title = getString(R.string.title_activity_menu_detail);
        url = getIntent().getStringExtra("url");
        Log.d("jsoup", url);
        initView();
        // initData();
        // Network access must not run on the main thread, so load on a worker thread
        Thread thread = new Thread(new LoadThread());
        thread.start();
    }

    // Initialize the UI widgets
    private void initView() {
        rvMenuDetail = (RecyclerView) findViewById(R.id.rvMenuDetail);
        rvMenuDetail.setLayoutManager(new GridLayoutManager(this, 1, GridLayoutManager.VERTICAL, false));
        menuDetailAdapter = new MenuDetailAdapter(this, menuDetailBeanList);
        headerView = menuDetailAdapter.setHeaderView(R.layout.header_menu_detail_layout, rvMenuDetail);
        layoutFodder = (LinearLayout) headerView.findViewById(R.id.layoutFodder);
        layoutFodderArt = (LinearLayout) headerView.findViewById(R.id.layoutFodderArt);
        tvRVTitle = (TextView) headerView.findViewById(R.id.tvRVTitle);
        sdvMenuDetailHeader = (SimpleDraweeView) headerView.findViewById(R.id.sdvMenuDetailHeader);
        rvMenuDetail.setAdapter(menuDetailAdapter);
    }

    private void initData() {
        try {
            Document document = Jsoup.connect(url).get();

            // Dish name
            Elements recipeDeImgBox = document.getElementsByClass("recipe_De_imgBox");
            myTitle = recipeDeImgBox.select("a").attr("title");
            Message titleMessage = Message.obtain();
            titleMessage.what = 1;
            handler.sendMessage(titleMessage);
            // Log.d("jsoup", title);

            // Large header image
            String imgUrl = recipeDeImgBox.select("a").select("img").attr("src");
            // Log.d("jsoup", imgUrl);
            Message headerImgUrlMessage = Message.obtain();
            headerImgUrlMessage.what = 2;
            headerImgUrlMessage.obj = imgUrl;
            handler.sendMessage(headerImgUrlMessage);

            // Ingredient section heading
            Elements fodderElement = document.getElementsByClass("mo mt20");
            // String fodderTitle = fodderElement.get(0).select("h3").text();
            // Log.d("jsoup", fodderTitle);

            // Individual ingredients
            Elements fodderItemElement = document.getElementsByClass("recipeCategory_sub_R clear");
            int fodderSize = fodderItemElement.select(".category_s1").size();
            for (int i = 0; i < fodderSize; i++) {
                // Ingredient name
                String fodderName = fodderItemElement.select(".category_s1").get(i).select("b").text();
                // Log.d("jsoup", fodderName);
                // Ingredient amount
                String fodderMany = fodderItemElement.select(".category_s2").get(i).text();
                // Log.d("jsoup", fodderMany);
                FodderBean fodderBean = new FodderBean();
                fodderBean.setFooderName(fodderName);
                fodderBean.setFooderMany(fodderMany);
                fodderBeanList.add(fodderBean);
            }
            // Log.d("jsoup", fodderBeanList.toString());

            // Number of rows needed at three items per row (rounded up)
            int weight = fodderSize % 3 == 0 ? fodderSize / 3 : fodderSize / 3 + 1;
            for (int i = 0; i < weight; i++) {
                LinearLayout linearLayout = new LinearLayout(MenuDetailActivity.this);
                linearLayout.setOrientation(LinearLayout.HORIZONTAL);
                linearLayout.setWeightSum(3);
                for (int j = 0; j < 3; j++) {
                    View view = LayoutInflater.from(MenuDetailActivity.this).inflate(R.layout.item_fodder_layout, null);
                    LinearLayout.LayoutParams layoutParams = new LinearLayout.LayoutParams(LinearLayout.LayoutParams.MATCH_PARENT, LinearLayout.LayoutParams.WRAP_CONTENT);
                    layoutParams.weight = 1f;
                    // Cells past the end of the list are added empty to keep the columns aligned
                    if ((3 * i + j + 1) <= fodderSize) {
                        FodderBean fodderBean = fodderBeanList.get(3 * i + j);
                        TextView tvFodderTitle = (TextView) view.findViewById(R.id.tvFodderTitle);
                        tvFodderTitle.setText(fodderBean.getFooderName());
                        TextView tvFodderNum = (TextView) view.findViewById(R.id.tvFodderNum);
                        tvFodderNum.setText(fodderBean.getFooderMany());
                    }
                    linearLayout.addView(view, layoutParams);
                }
                Message message3 = Message.obtain();
                message3.obj = linearLayout;
                message3.what = 3;
                handler.sendMessage(message3);
                // layoutFodder.addView(linearLayout); // the row is built on the worker thread; the UI update goes through the handler
            }

            // Recipe attributes (flavor, technique, time, difficulty)
            Elements fodderArtElement = document.getElementsByClass("recipeCategory_sub_R mt30 clear");
            int fodderArtSize = fodderArtElement.select(".category_s1").size();
            for (int i = 0; i < fodderArtSize; i++) {
                String fodderArtTitle = fodderArtElement.select(".category_s1").get(i).select("a").attr("title");
                Log.d("jsoup", fodderArtTitle);
                String fodderArtDetail = fodderArtElement.select(".category_s2").get(i).text();
                Log.d("jsoup", fodderArtDetail);
                FodderArtBean fodderArtBean = new FodderArtBean();
                fodderArtBean.setFodderArtTitle(fodderArtTitle);
                fodderArtBean.setFodderArtDetail(fodderArtDetail);
                fodderArtBeanList.add(fodderArtBean);
            }
            Log.d("jsoup", fodderArtBeanList.toString());

            int Aweight = fodderArtSize % 3 == 0 ? fodderArtSize / 3 : fodderArtSize / 3 + 1;
            for (int i = 0; i < Aweight; i++) {
                LinearLayout linearLayout = new LinearLayout(MenuDetailActivity.this);
                linearLayout.setOrientation(LinearLayout.HORIZONTAL);
                linearLayout.setWeightSum(3);
                for (int j = 0; j < 3; j++) {
                    View view = LayoutInflater.from(MenuDetailActivity.this).inflate(R.layout.item_fodder_layout, null);
                    LinearLayout.LayoutParams layoutParams = new LinearLayout.LayoutParams(LinearLayout.LayoutParams.MATCH_PARENT, LinearLayout.LayoutParams.WRAP_CONTENT);
                    layoutParams.weight = 1f;
                    if ((3 * i + j + 1) <= fodderArtSize) {
                        FodderArtBean fodderArtBean = fodderArtBeanList.get(3 * i + j);
                        TextView tvFodderTitle = (TextView) view.findViewById(R.id.tvFodderTitle);
                        tvFodderTitle.setText(fodderArtBean.getFodderArtTitle());
                        TextView tvFodderNum = (TextView) view.findViewById(R.id.tvFodderNum);
                        tvFodderNum.setText(fodderArtBean.getFodderArtDetail());
                    }
                    linearLayout.addView(view, layoutParams);
                }
                Message message4 = Message.obtain();
                message4.obj = linearLayout;
                message4.what = 4;
                handler.sendMessage(message4);
                // layoutFodderArt.addView(linearLayout); // the row is built on the worker thread; the UI update goes through the handler
            }

            // Step-by-step instructions
            Elements recipeStepElement = document.getElementsByClass("recipeStep_word");
            Elements recipeStepImgElement = document.getElementsByClass("recipeStep_img");
            int recipeStepSize = recipeStepElement.size();
            for (int i = 0; i < recipeStepSize; i++) {
                // Step image
                String recipeImg = recipeStepImgElement.get(i).select("img").attr("src");
                // Log.d("jsoup", recipeImg);
                // Step description
                String recipeStep = recipeStepElement.get(i).text();
                // Log.d("jsoup", recipeStep);
                MenuDetailBean menuDetailBean = new MenuDetailBean();
                menuDetailBean.setMenuDetailImg(recipeImg);
                menuDetailBean.setMenuDetail(recipeStep);
                menuDetailBeanList.add(menuDetailBean);
                Message message = Message.obtain();
                message.what = 0;
                handler.sendMessage(message);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    class LoadThread implements Runnable {
        @Override
        public void run() {
            initData();
        }
    }

    // Receives messages from the worker thread and applies them to the UI
    Handler handler = new Handler() {
        @Override
        public void handleMessage(Message msg) {
            super.handleMessage(msg);
            switch (msg.what) {
                case 0: // a step was added to the list
                    menuDetailAdapter.notifyDataSetChanged();
                    break;
                case 1: // dish name
                    toolbar.setTitle(myTitle);
                    tvRVTitle.setText(myTitle);
                    break;
                case 2: // header image URL
                    ControllerListenerUtil.setControllerListener(sdvMenuDetailHeader, (String) msg.obj, MyUtils.getScreenWidth(MenuDetailActivity.this));
                    break;
                case 3: // one row of ingredients
                    LinearLayout linearLayout = (LinearLayout) msg.obj;
                    layoutFodder.addView(linearLayout);
                    break;
                case 4: // one row of recipe attributes
                    LinearLayout linearLayout1 = (LinearLayout) msg.obj;
                    layoutFodderArt.addView(linearLayout1);
                    break;
            }
        }
    };
}
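The activity lays out ingredients three per row and computes the row count with `size % 3 == 0 ? size / 3 : size / 3 + 1`. The grouping can be looked at in isolation; the following is a small sketch in plain Java, not the author's code. The class `RowChunker` and its methods `rowCount` and `chunk` are hypothetical names, and the ceiling-division formula `(size + perRow - 1) / perRow` is an equivalent rewrite of the ternary above for positive `perRow`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original project.
class RowChunker {

    // Number of rows needed to show `size` items at `perRow` items per row
    // (integer ceiling division).
    static int rowCount(int size, int perRow) {
        if (perRow <= 0) {
            throw new IllegalArgumentException("perRow must be positive");
        }
        return (size + perRow - 1) / perRow;
    }

    // Splits `items` into consecutive rows of at most `perRow` elements.
    // The last row may be shorter than the others.
    static <T> List<List<T>> chunk(List<T> items, int perRow) {
        if (perRow <= 0) {
            throw new IllegalArgumentException("perRow must be positive");
        }
        List<List<T>> rows = new ArrayList<>();
        for (int start = 0; start < items.size(); start += perRow) {
            int end = Math.min(start + perRow, items.size());
            rows.add(new ArrayList<>(items.subList(start, end)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Ten ingredients, as in the HTML excerpt, at three per row.
        List<Integer> indices = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        System.out.println(rowCount(indices.size(), 3));
        System.out.println(chunk(indices, 3));
    }
}
```

One difference from the activity: there, every row always gets three cells and the surplus cells are left blank so the columns stay aligned. With this sketch the last row is simply shorter, so a caller that needs equal-width columns would still have to pad it.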
The XML layout is just a list with no pull-to-refresh, so I won't post it here.