What are some good methods to browse datasets in python like the Data Editor in Stata or R?

Steve Byrnes at Quora

Other answers

pandas has data frames, which are very similar to data frames in R. I recommend watching this video (http://vimeo.com/59324550) and browsing the pandas documentation (http://pandas.pydata.org/pandas-docs/stable/index.html).

Kevin Lin

IPython (http://ipython.org/). Specifically the IPython Notebook (http://ipython.org/notebook.html), which works inside a browser. It's great for both exploratory data analysis (e.g. browsing datasets) and then sharing your work (in an editable and reproducible way). Revolutionary stuff.

You can look at a DataFrame as a scrollable HTML table (with a plugin you can even edit the table like you would a spreadsheet), so it doesn't take up the entire screen, yet you can still look in detail at every row/cell.

I would usually read the data in (e.g. as a CSV). If it's "bigish" I'd look at df.head() or df.head(100), df.tail(), and then manually see what cleanup may need to be done, mostly on a column-by-column basis. Some examples:

- dtype, e.g. a string column that should be ints (or floats): look at the types with .dtypes, coerce with convert_objects (since deprecated in favor of functions such as to_numeric) or astype.
- normalization, e.g. similar strings should match: if it's a Categorical, look at .unique().
- extracting multiple columns from a single column, e.g. using str.extract.
- string or multiple columns to a datetime, using to_datetime.

You can preview function documentation as you type, etc., as you would expect with a "proper" IDE, and write notes in markdown alongside your code. It's also used to build the pandas online documentation (via the IPython sphinx directive)!

See, for example, https://www.wakari.io/nb/url///wakari.io/static/notebooks/Lecture_0_Scientific_Computing_with_Python.ipynb and http://nbviewer.ipython.org/github/jvns/talks/blob/master/pyconca2013/pistes-cyclables.ipynb.

p.s. it's bundled in with Continuum's Anaconda (I highly recommend this free tool for painlessly installing numpy, pandas, scipy and friends), and is an easy pip install ipython away for everyone else!
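The cleanup steps listed in this answer could look roughly like the sketch below. It is not from the original answer: the CSV content and the column names (id, city, reading, day, month, year) are invented for illustration, and pd.to_numeric stands in for the older coercion helper.

```python
import io

import pandas as pd

# Hypothetical CSV content, invented purely for illustration.
RAW_CSV = """id,city,reading,day,month,year
1,Montreal,12.5 km,3,1,2013
2,montreal ,7 km,4,1,2013
3,Toronto,unknown,5,1,2013
4,Toronto,20.25 km,6,1,2013
"""

# Read every column as a string so that each type conversion below is explicit.
df = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)

# Browse: first rows, last rows, and the column types.
print(df.head())
print(df.tail(2))
print(df.dtypes)

# dtype: a string column that should hold integers.
df["id"] = df["id"].astype(int)

# Normalization: inspect the distinct values, then make similar strings match.
print(df["city"].unique())
df["city"] = df["city"].str.strip().str.title()

# Extraction: pull the numeric part out of a mixed string column.
# Rows that do not match the pattern become NaN.
km_text = df["reading"].str.extract(r"([\d.]+)\s*km", expand=False)
df["km"] = pd.to_numeric(km_text)

# Datetime: assemble a single datetime column from year/month/day columns.
df["date"] = pd.to_datetime(df[["year", "month", "day"]].astype(int))

print(df)
```

In a notebook, ending a cell with df (instead of print(df)) renders the frame as an HTML table, which is the browsing experience described above.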

Andy Hayden

Python's built-in programming interface has no facility for browsing variables and datasets, but Portable Python, a third-party IDE, supports it. Because programming in Python is done by editing text, Portable Python's browsing interface lists all variables and datasets together, which makes it a little difficult to find a particular one. It is also inconvenient because you cannot copy the content of the dataset you are viewing. In view of this, I suggest you try esProc if your program involves structured or semi-structured data. With esProc's cellset-style programming, you just need to click the cellset to check its execution result. Datasets are displayed in an Excel-like view that is easy to use.

David Jin
