[Compared with Python] Log Processing 1 | 润乾 (Raqsoft)
Task: each log entry is recorded across three lines; organize the log into a structured file.
Python
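import pandas as pd
import numpy as np

log_file = 'e://txt//access_log.txt'
# Read each line as one row (assumes the lines contain no commas)
log_info = pd.read_csv(log_file, header=None)
# Group every three consecutive lines by integer-dividing the row index
log_g = log_info.groupby(log_info.index // 3)
rec_list = []
for i, g in log_g:
    # Flatten the three rows of the group into a 1-D array of three strings
    rec = g.values.reshape(1*3)
    # Second line: keep the text after the last ":" and drop "#"
    rec[1] = rec[1].split(":")[-1].replace("#", "")
    # Join with tabs, then split again to get one flat list of fields
    rec = "\t".join(rec)
    rec = np.array(rec.split("\t"))
    # Reorder the fields to match the output columns
    rec = rec[[6, 7, 0, 1, 3, 4, 8, 5]]
    rec_list.append(rec)
rec_df = pd.DataFrame(rec_list, columns=["userid","uname","ip","time","url","brower","location","module"])
print(rec_df)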
esProc (集算器)
  | A |
1 | e://txt//access_log.txt |
2 | =file(A1).import@s() |
3 | =A2.group((#-1)\3) |
4 | =A3.(~.(_1).concat("\t").array("\t")) |
5 | =A4.new(~(7):userid,~(8):uname,~(1):ip,~(2):time,~(4):url,~(5):brower,~(9):location,left(~(6).array("\:")(2),-1):module) |
With a mechanism for grouping by line number, a loop can process one group of data at a time, which makes the task simpler.
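The grouping-by-line-number idea can be sketched in plain Python without any library. This is a minimal illustration, not the author's method: the sample lines are invented (the source does not show the log file), and the helper names `group_by_line_number` and `parse_record` are hypothetical. The field order reuses the index list `[6, 7, 0, 1, 3, 4, 8, 5]` from the code above.

```python
# Hypothetical sample: two log records, three lines each, tab-separated.
# The layout follows the field indices used above; the values are invented.
SAMPLE_LINES = [
    "10.0.0.1\t2018-01-01 10:00:00\tGET\t/index\tChrome",
    "#module:login#",
    "1001\talice\tBeijing",
    "10.0.0.2\t2018-01-01 10:05:00\tGET\t/cart\tFirefox",
    "#module:order#",
    "1002\tbob\tShanghai",
]

COLUMNS = ["userid", "uname", "ip", "time", "url", "brower", "location", "module"]


def group_by_line_number(lines, size=3):
    """Put line i (0-based) into group i // size, like index // 3 in pandas."""
    groups = {}
    for i, line in enumerate(lines):
        groups.setdefault(i // size, []).append(line)
    return list(groups.values())


def parse_record(group):
    """Turn one three-line group into a dict keyed by COLUMNS."""
    first, second, third = group
    # Second line: keep the text after the last ":" and drop "#"
    module = second.split(":")[-1].replace("#", "")
    # Flat field list: 5 fields, then the module, then 3 fields
    fields = first.split("\t") + [module] + third.split("\t")
    order = [6, 7, 0, 1, 3, 4, 8, 5]
    return dict(zip(COLUMNS, (fields[k] for k in order)))


records = [parse_record(g) for g in group_by_line_number(SAMPLE_LINES)]
for rec in records:
    print(rec)
```

The key step is the integer division `i // size`: lines 0 to 2 fall into group 0, lines 3 to 5 into group 1, and so on, which is what `log_info.index // 3` and `(#-1)\3` do in the two versions above.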