Python: scrape the 7-day forecast from China Weather (weather.com.cn) and save it locally

1. China Weather (中國天氣網)
http://www.weather.com.cn/weather/101010100.shtml
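
The number at the end of this URL is a city code, and the script in section 5 builds one URL per city by swapping that code. A minimal sketch of that step; the names `BASE_URL` and `forecast_url` are made up for illustration and do not appear in the original script:

```python
BASE_URL = 'http://www.weather.com.cn/weather/'

def forecast_url(city_id):
    """Return the 7-day forecast page URL for a city code (hypothetical helper)."""
    return BASE_URL + city_id + '.shtml'

print(forecast_url('101010100'))
```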

2. Analyze the page
[screenshot]

3. In the browser, press F12 and inspect how the elements are nested
[screenshot]

4. Import the required libraries

import requests
from bs4 import BeautifulSoup
import re

5. Code

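# Output lines for every city, accumulated here and written to disk in main()
result_list_wt = []

def get_page(url):
    """Fetch a page and return its HTML, or the string 'error' on failure."""
    try:
        kv = {'user-agent': 'Mozilla/5.0'}
        r = requests.get(url, headers=kv, timeout=10)
        r.raise_for_status()
        # Let requests guess the encoding from the body so Chinese text decodes correctly
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return 'error'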

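def parse_page(html, return_list):
    """Append one [date, weather, low, high, wind direction, wind level] row per day."""
    soup = BeautifulSoup(html, 'html.parser')
    forecast = soup.find('ul', 't clearfix')
    if forecast is None:
        # Not a forecast page (for example the 'error' string from get_page)
        return
    for day in forecast.find_all('li'):
        date = day.find('h1').get_text()
        wea = day.find('p', 'wea').get_text()
        tem = day.find('p', 'tem')
        # The page may omit the high temperature, so fall back to an empty string
        high = tem.find('span')
        hightem = high.get_text() if high else ''
        lowtem = tem.find('i').get_text()
        # Wind directions are read from the title="..." attributes inside <p class="win"><em>
        win = re.findall('(?<= title=").*?(?=")', str(day.find('p', 'win').find('em')))
        wind = '-'.join(win)
        level = day.find('p', 'win').find('i').get_text()
        return_list.append([date, wea, lowtem, hightem, wind, level])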

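def print_res(return_list):
    """Format the rows as a table and append them to result_list_wt."""
    # {6} is the fill character: chr(12288) is the full-width space, which keeps CJK columns aligned
    tplt = '{0:<10}\t{1:^10}\t{2:^10}\t{3:{6}^10}\t{4:{6}^10}\t{5:{6}^5}'
    # Header: date, weather, low temperature, high temperature, wind direction, wind level
    result_list_wt.append(tplt.format('日期', '天氣', '最低温', '最高温', '風向', '風力', chr(12288)) + "\n")
    for i in return_list:
        result_list_wt.append(tplt.format(i[0], i[1], i[2], i[3], i[4], i[5], chr(12288)) + "\n")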
        
def main():
    # city_list.txt holds one "city name-city code" pair per line (assumed to be saved as UTF-8)
    with open('city_list.txt', 'r', encoding='utf-8') as files:
        city_name_id = files.readlines()
    for line in city_name_id:
        line = line.strip()
        if not line:
            continue
        try:
            city_name, name_id = line.split('-', 1)
            url = 'http://www.weather.com.cn/weather/' + name_id + '.shtml'
            result_list_wt.append("\n" + "城市名 : " + city_name + "\n")  # "城市名" = city name
            html = get_page(url)
            wea_list = []
            parse_page(html, wea_list)
            print_res(wea_list)
        except Exception as exc:
            # Report the failing city and carry on with the rest
            print("error:", line, exc)
    # Write the collected results to a local file
    msgs = ''.join(result_list_wt)
    print(msgs)
    with open('weather.China.txt', 'w', encoding='utf-8') as file:
        file.write(msgs)

if __name__ == '__main__':
    main()
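
The `{6}` slot in `tplt` passes a custom fill character to the format spec. The short sketch below shows that one idea in isolation: padding with the full-width space `chr(12288)` instead of an ASCII space, so that Chinese text is padded with characters of the same display width. The name `FULL_WIDTH_SPACE` is introduced here only for readability.

```python
FULL_WIDTH_SPACE = chr(12288)  # U+3000, the ideographic (full-width) space

# Center a one-character weather label in a 10-character cell,
# padding with full-width spaces instead of ASCII spaces.
cell = '{0:{1}^10}'.format('晴', FULL_WIDTH_SPACE)

print(repr(cell))
print(len(cell))
```

How well the columns line up in practice still depends on the font and on any ASCII characters (digits, `℃`) mixed into the cells.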

6. city_list.txt

上海-101020100
蘇州-101190401
無錫-101190201
南京-101190101
鎮江-101190301
宜興-101190203
揚州-101190601
常州-101191101
杭州-101210101
寧波-101210401
義烏-101210904
温州-101210701
台州-101210601
湖州-101210201
金華-101210901
紹興-101210507
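
Each line above is a city name and its city code joined by a hyphen. A minimal sketch of turning lines in that format into (name, code) pairs; `parse_city_lines` is a hypothetical helper, not part of the original script, and the sample lines are copied from the list above:

```python
def parse_city_lines(lines):
    """Turn 'name-code' lines into (name, code) tuples, skipping blank lines."""
    cities = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first hyphen only, in case a name ever contains one
        name, code = line.split('-', 1)
        cities.append((name, code))
    return cities

sample = ['上海-101020100\n', '蘇州-101190401\n', '\n']
print(parse_city_lines(sample))
```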

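7. Uses

1. Push the forecast to WeCom (企業微信)
2. Push it to DingTalk (釘釘)
3. @-mention specific people, or push to a specific group
4. Turn it into a reminder bot
5. Scrape on a schedule, check the current weather for each city, and apply the result to different business scenarios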
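
As a rough illustration of items 1, 3 and 5, the sketch below filters parsed rows for rainy days and builds a text-message body that could be posted to a chat webhook. Everything here is an assumption for illustration: the helper names, the sample rows (invented, laid out like the rows `parse_page` produces), the user id, and the payload fields (`msgtype`, `text.content`, `mentioned_list`), which are modelled on the WeCom group-robot text message and should be checked against the official documentation before use. Nothing is sent over the network.

```python
import json

def rainy_days(rows):
    """Return the rows whose weather text (column 1) mentions rain ('雨')."""
    return [row for row in rows if '雨' in row[1]]

def build_text_payload(content, mentioned=None):
    """Build a JSON text-message body; the field names are an assumption."""
    payload = {'msgtype': 'text', 'text': {'content': content}}
    if mentioned:
        payload['text']['mentioned_list'] = list(mentioned)
    return json.dumps(payload, ensure_ascii=False)

# Invented sample rows: [date, weather, low, high, wind direction, wind level]
rows = [
    ['7日(今天)', '小雨', '18℃', '25', '東風', '<3級'],
    ['8日(明天)', '晴', '17℃', '27', '南風', '<3級'],
]

alerts = rainy_days(rows)
message = '\n'.join(row[0] + ' ' + row[1] for row in alerts)
print(build_text_payload(message, mentioned=['some_user_id']))
```

Keeping the rule (`rainy_days`) separate from the message builder makes it easy to swap in a different condition or a different chat platform.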

8. Contents written to the local file
[screenshot]
