对excel用户最友好的、交互能力最强的 数据分析编程语言 | 润乾 -九游会登陆

新颖的网格编程

像excel的网格编程

像excel的网格编程
sql调试极繁琐
spl便于调试的ide
python调试也麻烦
select code, max(con_rise) as longest_up_days
from (
select code, count(*) as con_rise
from (
select code, dt,
sum(updown_flag) over (partition by code order by code, dt) as no_up_days
from (
select code, dt,
case when cl > lag(cl) over (partition by code order by code, dt) then 0
else 1 end as updown_flag
from stock
)
)
group by code, no_up_days
)
group by code

拆分sql执行中间步骤

print大法输出中间结果

强交互性适应探索式分析

xll插件辅助excel

spl代码直接写在excel中: 找出股票连涨超过5天的区间
=spl("=e(?1).sort(code,dt).group@i(code!=code[-1]||cl < cl[-1]).select(~.len()>=5).conj()",a1:d253)

简洁强大的代码

常规运算齐全简单

特有集合有序运算

计算每支股票的最长连涨天数

spl

a
2stockrecords.xlsx
3=t(a1).sort(dt)
4=a2.group(code;~.group@i(cl < cl[-1]).max(~.len()):max_increase_days)

特别擅长完成次序相关、移动窗口、跨行运算等复杂场景,比sql、python更简单

python

import pandas as pd stock_file = "stockrecords.txt" stock_info = pd.read_csv(stock_file,sep="\t") stock_info.sort_values(by=['code','dt'],inplace=true) stock_group = stock_info.groupby(by='code') stock_info['label'] = stock_info.groupby('code')['cl'].diff().fillna(0).le(0).astype(int).cumsum() max_increase_days = {} for code, group in stock_info.groupby('code'): max_increase_days[code] = group.groupby('label').size().max() – 1 max_rise_df = pd.dataframe(list(max_increase_days.items()), columns=['code', 'max_increase_days'])

sql

select code, max(con_rise) as longest_up_days from ( select code, count(*) as con_rise from ( select code, dt, sum(updown_flag) over (partition by code order by code, dt) as no_up_days from ( select code, dt, case when cl > lag(cl) over (partition by code order by code, dt) then 0 else 1 end as updown_flag from stock ) ) group by code, no_up_days ) group by code

特有集合有序运算

计算每支股票的最长连涨天数

spl

a
2stockrecords.xlsx
3=t(a1).sort(dt)
4=a2.group(code;~.group@i(cl < cl[-1]).max(~.len()):max_increase_days)

特别擅长完成次序相关、移动窗口、跨行运算等复杂场景,比sql、python更简单

sql

select code, max(con_rise) as longest_up_days
from (
    select code, count(*) as con_rise
    from (
        select code, dt,  sum(updown_flag) over (partition by code order by code, dt) as no_up_days
        from (
            select code, dt, 
                    case when cl > lag(cl) over (partition by code order by code, dt)  then 0
                    else 1 end as updown_flag
            from stock
        )
    )
    group by code, no_up_days
)
group by code	

python

import pandas as pd
stock_file = "stockrecords.txt"
stock_info = pd.read_csv(stock_file,sep="\t")
stock_info.sort_values(by=['code','dt'],inplace=true)
stock_group = stock_info.groupby(by='code')
stock_info['label'] = stock_info.groupby('code')['cl'].diff().fillna(0).le(0).astype(int).cumsum()
max_increase_days = {}
for code, group in stock_info.groupby('code'):
    max_increase_days[code] = group.groupby('label').size().max() – 1
max_rise_df = pd.dataframe(list(max_increase_days.items()), columns=['code', 'max_increase_days'])

简易的大数据与并行支持

大数据

内存
a
1stockrecords.txt
2=file(a1).import@t().sort(code,dt)
3=a2.group(code;~.group@i(cl < cl[-1]).max(~.len()):mi)
外存
a
1stockrecords.txt
2=file(a1).cursor@t().sort(code,dt)
3=a2.group(code;~.group@i(cl < cl[-1]).max(~.len()):mi)

并行计算

串行
a
1stockrecords.txt
2=file(a1).cursor@t().sortx(code,dt)
3=a2.group@i(code!=code[-1]||cl< cl[-1])
4=a3.select(~.len()>=5).conj()
并行
a
1stockrecords.txt
2=file(a1).cursor@tm().sortx(code,dt)
3=a2.group@i(code!=code[-1]||cl< cl[-1])
4=a3.select(~.len()>=5).conj()

轻量又便携

网站地图