从Python运行Stata Do文件
问题描述
我有一个Python
脚本,用于清理大型面板数据集(2,000,000+ observations
)并执行基本统计计算。
我发现其中一些任务更适合Stata
,并编写了一个包含必要命令的do文件。因此,我想在我的Python代码中运行一个.do文件。如何从Python
调用.do
文件?
解决方案
我认为@user229552指向了正确的方向。可以使用Python的subprocess
模块。下面是一个适用于我的Linux操作系统的示例。
假设您有一个名为pydo.py
的Python文件,其中包含以下内容:
import subprocess
## Do some processing in Python
## Set do-file information
dofile = "/home/roberto/Desktop/pyexample3.do"
cmd = ["stata", "do", dofile, "mpg", "weight", "foreign"]
## Run do-file
subprocess.call(cmd)
和一个名为pyexample3.do
的Stata DO文件,其中包含以下内容:
clear all
set more off
local y `1'
local x1 `2'
local x2 `3'
display `"first parameter: `y'"'
display `"second parameter: `x1'"'
display `"third parameter: `x2'"'
sysuse auto
regress `y' `x1' `x2'
exit, STATA clear
然后在终端窗口中执行pydo.py
即可正常工作。
您还可以定义一个Python函数并使用该函数:
## Define a Python function to launch a do-file
def dostata(dofile, *params):
## Launch a do-file, given the fullpath to the do-file
## and a list of parameters.
import subprocess
cmd = ["stata", "do", dofile]
for param in params:
cmd.append(param)
return subprocess.call(cmd)
## Do some processing in Python
## Run a do-file
dostata("/home/roberto/Desktop/pyexample3.do", "mpg", "weight", "foreign")
来自终端的完整呼叫,结果:
roberto@roberto-mint ~/Desktop
$ python pydo.py
___ ____ ____ ____ ____ (R)
/__ / ____/ / ____/
___/ / /___/ / /___/ 12.1 Copyright 1985-2011 StataCorp LP
Statistics/Data Analysis StataCorp
4905 Lakeway Drive
College Station, Texas 77845 USA
800-STATA-PC http://www.stata.com
979-696-4600 stata@stata.com
979-696-4601 (fax)
Notes:
1. Command line editing enabled
. do /home/roberto/Desktop/pyexample3.do mpg weight foreign
. clear all
. set more off
.
. local y `1'
. local x1 `2'
. local x2 `3'
.
. display `"first parameter: `y'"'
first parameter: mpg
. display `"second parameter: `x1'"'
second parameter: weight
. display `"third parameter: `x2'"'
third parameter: foreign
.
. sysuse auto
(1978 Automobile Data)
. regress `y' `x1' `x2'
Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 2, 71) = 69.75
Model | 1619.2877 2 809.643849 Prob > F = 0.0000
Residual | 824.171761 71 11.608053 R-squared = 0.6627
-------------+------------------------------ Adj R-squared = 0.6532
Total | 2443.45946 73 33.4720474 Root MSE = 3.4071
------------------------------------------------------------------------------
mpg | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | -.0065879 .0006371 -10.34 0.000 -.0078583 -.0053175
foreign | -1.650029 1.075994 -1.53 0.130 -3.7955 .4954422
_cons | 41.6797 2.165547 19.25 0.000 37.36172 45.99768
------------------------------------------------------------------------------
.
. exit, STATA clear
来源:
http://www.reddmetrics.com/2011/07/15/calling-stata-from-python.html
http://docs.python.org/2/library/subprocess.html
http://www.stata.com/support/faqs/unix/batch-mode/
同时使用Python和Stata的不同途径可在
找到http://ideas.repec.org/c/boc/bocode/s457688.html
http://www.stata.com/statalist/archive/2013-08/msg01304.html
相关文章