How to plot a histogram?

  • My program imports these:

        import requests
        import demjson
        import pandas as pd
        from pandas import DataFrame
        import pylab
        pylab.show()

    I have a dataframe which if I print out looks like this:

            Strike    COI    POI
         0   50.00    927   1694
         1   55.00    394   1898
         2   60.00   2042   4438
         3   65.00    642   3696
         4   70.00   3169   3216
         5   75.00   2529   3222
         6   80.00   6268  14029
         7   85.00   3988   6241
         8   87.50    356   1516
         9   90.00  15676  14345
        10   92.50   1309   2498
        11   95.00   3303  11391
        12   97.50   1074   1472
        13  100.00  64930  19513
        14  105.00  10953   9286
        15  110.00  19956  13008
        16  115.00  13956  12932
        17  120.00  23440   9240
        18  125.00  12167   7467
        19  130.00  23531  10168
        20  135.00   9567   2637
        21  140.00  18967   6854
        22  145.00   7890   5176
        23  150.00  21516   8079
        24  155.00   3137    267
        25  160.00   4115    432
        26  165.00   1079    205
        27  170.00   4341    785
        28  175.00   6277   1631
        29  180.00   1805     35
        30  185.00    906    136
        31  190.00   1984    377
        32  195.00   3539    268

    Sometimes there are zero values like this:

            Strike    COI    POI
         0   95.00     53    663
         1  100.00     16    595
         2  105.00      6    377
         3  110.00     56   1217
         4  115.00    174    994
         5  120.00    631   3227
         6  125.00    701   1031
         7  130.00   2678    833
         8  135.00   1921   1049
         9  140.00   1238     10
        10  160.00   1486      0
        11  165.00   1900      0

    Unfortunately sometimes the Strike is a float like this:

            Strike    COI    POI
         0   34.29    476  12711
         1   35.71     95   7782
         2   37.14      0   7844
         3   38.57      0   3640
         4   40.00     93   6010
         5   41.43      0   5621
         6   42.86   1245  18146
         7   44.29    116   6844
         8   45.71    140   7099
         9   47.14    500    483
        10   48.57    445   3956
        11   50.00   1540  22362
        12   51.43    152   6366
        13   52.86    131   8354
        14   54.29    810   7542
        15   55.71    132   9337
        16   57.14  12455  15024
        17   58.57    662   5245
        18   60.00   1743   9116
        19   61.43   1368   7236
        20   62.86   1128  11890
        21   64.29   4537  24204
        22   65.71    766   5113
        23   67.14   1859  10572
        24   68.57  12407  11367
        25   70.00  13263  11748
        26   71.43  23400  31566
        27   72.86   2784  12984
        28   74.29  12679  20520
        29   75.71   6932  14617
        ..     ...    ...    ...
        63  115.00  39738  18033
        64  115.71   5293   2877
        65  116.43   1874   2748
        66  117.14   4181   1965
        67  117.86   3618   4214
        68  118.57  11652   4043
        69  120.00  81523  34752
        70  121.43  14239   3527
        71  122.86   9046   6160
        72  125.00    187     88
        73  125.71  22557   7381
        74  128.57  11053   8163
        75  130.00  74007  27825
        76  131.43   6747   1951
        77  132.86   7289   1383
        78  134.29   5872   1380
        79  135.71   4946   2047
        80  137.14   5349    590
        81  140.00  98310  57767
        82  145.00   9857    403
        83  150.00  64701   2063
        84  155.00  17398   1434
        85  160.00  12363   1133
        86  165.00   5222    539
        87  170.00   9050    918
        88  175.00   9848    678
        89  180.00   3408     85
        90  185.00   3243    768
        91  190.00   3646    419
        92  195.00   4789    149

    Since I want the Strikes to be the bin, I have tried to plot a histogram by saying:

        df.hist(by=df.Strike)

    but I either get nothing, or when I do see the system ready to plot with a bunch of little grids (I am using Spyder) I get this error before any plot. As far as I can see, all the dataframes have at least one point. The y-axis also doesn't make sense since its height appears to always be one:

        Traceback (most recent call last):
          File "<ipython-input-20-6f27fa6cf56c>", line 1, in <module>
            runfile('/home/idf/goog.py', wdir='/home/idf')
          File "/home/idf/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 682, in runfile
            execfile(filename, namespace)
          File "/home/idf/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 78, in execfile
            builtins.execfile(filename, *where)
          File "/home/idf/goog.py", line 153, in <module>
            df.hist(by=df.Strike)
          File "/home/idf/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.py", line 2740, in hist_frame
            **kwds)
          File "/home/idf/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.py", line 2873, in grouped_hist
            figsize=figsize, layout=layout, rot=rot)
          File "/home/idf/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.py", line 2983, in _grouped_plot
            plotf(group, ax, **kwargs)
          File "/home/idf/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.py", line 2867, in plot_group
            ax.hist(group.dropna().values, bins=bins, **kwargs)
          File "/home/idf/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 5597, in hist
            raise ValueError("x must have at least one data point")
        ValueError: x must have at least one data point

  • Answer:

    When you call the DataFrame.hist method (i.e. the pandas built-in plotting function), you only need to pass a column name:

        df.hist('Strike')  # which is the same as df.hist(column='Strike')

    If you use plt.hist instead (calling the matplotlib function directly), you need to pass df.Strike.values.
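
    A minimal sketch of the two calls mentioned in the answer, using a few rows copied from the first table in the question. Two things here are assumptions added for illustration and are not part of the original answer: the non-interactive Agg backend (so the script runs without a display) and the bar chart at the end. For context, the by argument of DataFrame.hist groups the rows by the given column and draws one histogram per group, so by=df.Strike asks for a separate subplot for every strike value. If the aim is one bar per strike with COI and POI as the bar heights, a bar chart may be closer to what was wanted than a histogram.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the first table in the question.
df = pd.DataFrame({
    "Strike": [50.0, 55.0, 60.0, 65.0, 70.0],
    "COI": [927, 394, 2042, 642, 3169],
    "POI": [1694, 1898, 4438, 3696, 3216],
})

# 1) pandas wrapper: pass the column name, as in the answer.
#    This shows how the Strike values themselves are distributed.
axes = df.hist(column="Strike")

# 2) matplotlib directly: pass the underlying values.
fig, ax = plt.subplots()
counts, edges, patches = ax.hist(df["Strike"].values)

# 3) Not from the answer (illustrative alternative): one pair of bars
#    per strike, with COI and POI as the heights.
ax_bar = df.plot.bar(x="Strike", y=["COI", "POI"])

# In an interactive session, plt.show() would display the figures.
```

    Here counts holds the number of Strike values that fall into each bin, so its entries add up to the number of rows, while the bar chart draws one rectangle per row and column.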

Ivan at Stack Overflow
