cz88纯真IP库下载

zmx

2023 年 02 月 16 日

408 次浏览

暂无评论

937字数

MySQL

# 引言

纯真库仅限个人开发者或学术研究使用，不能用于商业产品。
如本篇文章旨在学习python爬虫微信公众号，如果损害了您的利益，请联系站长，予以删除。
</div>

# 准备工作

- python
- re
- request
- uniextract

当然可以使用*BeautifulSoup*等包，也许会比本方法简单。

# 代码

```python
import requests
import re
get = requests.get(
    "https://mp.weixin.qq.com/mp/appmsgalbum?&action=getalbum&album_id=2329805780276838401#wechat_redirect") # 这里是公众号的链接

html = get.text
findall = re.findall('''<script type="text/javascript">(.*?)</script>''', html,re.S)
for tal in findall:
    b = re.findall("url: '(.*?)',", tal, re.S)
    for url in b:
        if url != '':
            sub_html = requests.get(url).text
            print(re.findall("数据版本：(.*?)</span></p>",sub_html,re.S)[0][-8:], 
                  re.findall("下载地址：(.*?)</span>",sub_html,re.S)[0][9:]) # 提取关键字
```

# 后续工作

解压zip文件，通过uniextract解压exe安装包，解析dat文件
<div class="preview">
<div class="post-inser post box-shadow-wrap-normal">
<a href="https://www.zunmx.top/archives/353/" target="_blank" class="post_inser_a no-external-link no-underline-link">
<div class="inner-image bg" style="background-image: url(https://www.zunmx.top/usr/themes/handsome/assets/img/sj/7.jpg);background-size: cover;"></div>

<div class="inner-content" >
<p class="inser-title">qqwry 解析(python3) 并且dump 到 mysql</p>
<div class="inster-summary text-muted">
准备工作python 3.7 + IDEinternet致谢本项目借鉴于qqwry-py3，相关文件已附上地址：h...
</div>
</div>
</a>

</div>

</div>

# 截图

![image.png](https://www.zunmx.top/usr/uploads/2023/02/700420558.png)