[Compared with Python] Log Processing 2 | 润乾

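Task: each log record spans a variable number of lines, and every line of a record starts with the same marker, which identifies the lines as belonging to one record.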
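To make the input shape concrete, here is a minimal sketch of splitting such a log into records with the standard library only. The sample lines, the `userid` marker, and the helper names `marker_of` and `split_into_records` are assumptions made for illustration; the real file may look different.

```python
from itertools import groupby

# Hypothetical sample: fields are tab-separated "key:value" pairs, and every
# line of a record starts with the same marker (here the userid field).
SAMPLE_LINES = [
    "userid:1\tgender:F\tage:30",
    "userid:1\tprovince:Hebei\tmusicid:m7",
    "userid:2\tgender:M",
    "userid:2\tage:41\tsalary:5000",
    "userid:2\twatch_time:35",
]


def marker_of(line):
    """Return the first tab-separated field, which marks the record."""
    return line.split("\t")[0]


def split_into_records(lines):
    """Group consecutive lines that share a marker; input order is preserved."""
    return [list(group) for _, group in groupby(lines, key=marker_of)]


records = split_into_records(SAMPLE_LINES)
for rec in records:
    print(marker_of(rec[0]), "->", len(rec), "line(s)")
```

Note that `itertools.groupby` only merges adjacent lines with equal markers, so it relies on the lines of one record being stored together in the file.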

Python

import pandas as pd

log_file = 'e://txt//indefinite _info.txt'
# The file has no header row
log_info = pd.read_csv(log_file, header=None)
# Group lines by their first tab-separated field (the record marker), keeping file order
log_g = log_info.groupby(log_info[0].apply(lambda x: x.split("\t")[0]), sort=False)
columns = ["userid", "gender", "age", "salary", "province", "musicid", "watch_time", "time"]
# One list of values per output column
df_dic = {}
for c in columns:
    df_dic[c] = []
for index, group in log_g:
    rec_dic = {}
    # Merge all lines of the record into one flat list of "key:value" fields
    rec = group.values.flatten()
    rec = '\t'.join(rec).split("\t")
    for r in rec:
        v = r.split(":")
        rec_dic[v[0]] = v[1]
    # Fill every column, using None where the record lacks that field
    for col in columns:
        if col not in rec_dic.keys():
            df_dic[col].append(None)
        else:
            df_dic[col].append(rec_dic[col])
df = pd.DataFrame(df_dic)
print(df)

esProc (集算器)

     A
1    e://txt//indefinite _info.txt
2    =file(A1).import@s()
3    [userid,gender,age,salary,province,musicid,watch_time,time]
4    =A2.group@o(_1.array("\t")(1))
5    =A4.(~.(_1.array("\t")).conj().id().align(A3,~.array("\:")(1)).(~.array("\:")(2))).conj()
6    =create(${A3.concat@c()}).record(A5)

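esProc's merge-style grouping (group@o, which groups adjacent lines without sorting) and its alignment operation (align) make this kind of log cleanup short and straightforward.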
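For readers more at home in Python, the alignment step can be sketched roughly as follows. This is only an illustrative analogue, not how esProc implements `align`: the function name, its signature, and the sample fields are assumptions. It arranges a record's "key:value" fields in a fixed column order and yields None where a column is missing.

```python
def align(fields, columns, sep=":"):
    """Place "key<sep>value" fields in column order; missing keys give None."""
    lookup = {}
    for field in fields:
        # partition splits on the first separator only
        key, _, value = field.partition(sep)
        lookup[key] = value
    return [lookup.get(col) for col in columns]


columns = ["userid", "gender", "age", "salary"]
# Hypothetical fields of one record, already merged from its lines;
# the repeated marker field collapses to a single entry.
fields = ["userid:2", "age:41", "gender:M", "userid:2"]
row = align(fields, columns)
print(row)  # ['2', 'M', '41', None]
```

Each aligned row has the same length as the column list, so the rows can be stacked directly into a table.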
