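Want to download meizitu images with Golang? Come take a look, it's convenient!
With some free time on my hands, I wanted to see what I could build with Golang. Python scripts for downloading meizitu images are everywhere, so I decided to write a simple little tool in Golang that does the same.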
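A quick look at the page structure
After opening https://www.meizitu.com and going to a detail page, the URL changes to https://www.meizitu.com/a/5511.html. Changing the number in the URL arbitrarily still gives an accessible page, so for now we will reach pages by entering the page index manually.
Next, let's look at the page structure, which is easy to inspect with Google Chrome's developer tools: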
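Downloading the images
Having roughly analyzed the page structure, we can now implement a simple image download. For this we use:
colly: fetches pages and images
uuid: generates UUIDs
cli: command-line toolkit
1. First, initialize a queue to hold the URLs that need to be visited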
    var q, _ = queue.New(
        2, // Number of consumer threads
        &queue.InMemoryQueueStorage{MaxSize: 10000}, // Use default queue storage
    )
2. Add the pages to download to the queue; I use command-line flags to choose the start and end pages
    // Reject any stray positional argument.
    if args := context.Args(); len(args) > 0 {
        return fmt.Errorf("invalid command: %q", args.Get(0))
    }
    start := context.Int("s")
    log.Println("起始页:", start) // "起始页" means "start page"
    end := context.Int("e")
    log.Println("截止页:", end) // "截止页" means "end page"
    // Queue one detail-page URL per index in [start, end].
    for i := start; i <= end; i++ {
        url := fmt.Sprintf("https://www.meizitu.com/a/%d.html", i)
        q.AddURL(url)
    }
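The flag handling and URL construction above can also be sketched with only the standard library. The following is a minimal, self-contained version that swaps the cli package for Go's standard flag package. The helper names pageURLs and parseRange, the default value of 1 for both flags, and the start <= end check are assumptions made for illustration and are not taken from the original project.

```go
package main

import (
	"flag"
	"fmt"
	"log"
	"os"
)

// pageURLs builds the detail-page URL for every index in [start, end].
func pageURLs(start, end int) []string {
	var urls []string
	for i := start; i <= end; i++ {
		urls = append(urls, fmt.Sprintf("https://www.meizitu.com/a/%d.html", i))
	}
	return urls
}

// parseRange reads the -s and -e flags from args and validates them.
// The defaults (1 and 1) are assumed, not taken from the original tool.
func parseRange(args []string) (start, end int, err error) {
	fs := flag.NewFlagSet("main", flag.ContinueOnError)
	fs.IntVar(&start, "s", 1, "first page index")
	fs.IntVar(&end, "e", 1, "last page index")
	if err = fs.Parse(args); err != nil {
		return 0, 0, err
	}
	// Mirror the original check: positional arguments are not allowed.
	if fs.NArg() > 0 {
		return 0, 0, fmt.Errorf("invalid command: %q", fs.Arg(0))
	}
	if start > end {
		return 0, 0, fmt.Errorf("start page %d is after end page %d", start, end)
	}
	return start, end, nil
}

func main() {
	start, end, err := parseRange(os.Args[1:])
	if err != nil {
		log.Fatal(err)
	}
	for _, u := range pageURLs(start, end) {
		fmt.Println(u) // the original code calls q.AddURL(u) at this point
	}
}
```

Keeping the URL construction in its own function makes the range logic easy to check without touching the network or the queue.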
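3. Initialize a colly collector to process the URLs in the queue. In OnHTML, I look up the img DOM nodes under .postContent, read each image's path, and put it back into the queue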
    c := colly.NewCollector()

    c.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
    // For each post body, queue the src of every image it contains.
    c.OnHTML(".postContent", func(e *colly.HTMLElement) {
        //e.Request.Visit(e.Attr("href"))
        e.ForEach("img", func(i int, element *colly.HTMLElement) {
            //e.Request.Visit(element.Attr("src"))
            q.AddURL(element.Attr("src"))
        })
    })
    // Save the response body only when the server reports a JPEG image.
    c.OnResponse(func(resp *colly.Response) {
        if strings.Contains(resp.Headers.Get("Content-Type"), "image/jpeg") {
            download(resp.Body)
        }
    })

    // Log every URL before it is requested.
    c.OnRequest(func(r *colly.Request) {
        fmt.Println("Visiting", r.URL)
    })
    // Consume the queue with the collector.
    q.Run(c)
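The download function called in OnResponse is not shown in the article. Below is a hedged sketch of what such a helper might look like, using only the standard library. The original project uses a uuid package to name files; here a hypothetical randomName helper based on crypto/rand stands in for it. The dir parameter, the .jpg extension, and the isJPEG helper (a stricter variant of the strings.Contains check) are likewise assumptions, so the signature differs from the one-argument download used above.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
	"mime"
	"os"
	"path/filepath"
)

// isJPEG reports whether a Content-Type header value denotes a JPEG image,
// ignoring parameters such as "; charset=binary".
func isJPEG(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	return err == nil && mediaType == "image/jpeg"
}

// randomName returns 32 random hex characters followed by ".jpg".
// It is a stand-in for the UUID-based name used by the original project.
func randomName() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf) + ".jpg", nil
}

// download writes body to a new, randomly named file inside dir
// and returns the path of the file it created.
func download(dir string, body []byte) (string, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	name, err := randomName()
	if err != nil {
		return "", err
	}
	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, body, 0o644); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	// Demo: write a few placeholder bytes into a temporary directory.
	dir, err := os.MkdirTemp("", "images")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	if isJPEG("image/jpeg") {
		path, err := download(dir, []byte("not a real image"))
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println("saved", path)
	}
}
```

Random file names avoid collisions between images such as 01.jpg that share a base name across different posts.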
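The final result:
The code is simple; if you need it, see https://github.com/eyiadmin/meizitu. If you just want to download the images, download the exe file directly and run main -s 1 -e 100 on the command line.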
    main -s 100 -e 108
    2019/10/07 14:26:42 起始页: 100
    2019/10/07 14:26:43 截止页: 108
    Visiting https://www.meizitu.com/a/100.html
    Visiting https://www.meizitu.com/a/101.html
    Visiting https://www.meizitu.com/a/102.html
    Visiting https://www.meizitu.com/a/103.html
    Visiting https://www.meizitu.com/a/104.html
    Visiting https://www.meizitu.com/a/105.html
    Visiting https://www.meizitu.com/a/106.html
    Visiting https://www.meizitu.com/a/107.html
    Visiting https://www.meizitu.com/a/108.html
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/01.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/02.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/03.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/04.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/05.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/06.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/01.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/07.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/08.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/09.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/02.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/03.jpg
    Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/04.jpg