The easiest way is to pass an array of integer category levels to the color parameter of plt.scatter():

    plt.scatter(df['carat'], df['price'], c=pd.factorize(df['color'])[0])
    plt.gca().set(xlabel='Carat', ylabel='Price', title='Carat vs. ...')

This creates a plot without a legend, using the default "viridis" colormap. In this case "viridis" is not a good default choice because its colors appear to imply a sequential order rather than purely nominal categories.

To choose your own colormap and add a legend, the simplest approach is this:

    import matplotlib.patches

    levels, categories = pd.factorize(df['color'])
    colors = [plt.cm.tab10(i) for i in levels]  # using the "tab10" colormap
    handles = [matplotlib.patches.Patch(color=plt.cm.tab10(i), label=c) for i, c in enumerate(categories)]

    plt.scatter(df['carat'], df['price'], c=colors)
    plt.legend(handles=handles, title='Color')

I chose the "tab10" discrete (aka qualitative) colormap here, which does a better job of signaling that the color factor is a nominal categorical variable. In the first plot, the default colors are chosen by min-max scaling the array of integer category levels returned by pd.factorize() and passing the scaled values to the call method of the plt.cm.viridis colormap object.

Another approach assumes the same DataFrame and groups it by color. It then iterates over these groups, plotting each one. To select a color, I've created a colors dictionary, which maps the diamond color (for instance D) to a real color (for instance tab:blue).

Normally when quickly plotting a DataFrame, I use pd.DataFrame.plot(). This takes the index as the x value, the value as the y value, and plots each column separately with a different color. A DataFrame in this form can be achieved by using set_index and unstack:

    carat = ...
    price = ...
    color = ...
    df = pd.DataFrame(dict(carat=carat, price=price, color=color))
    df.set_index(['color', 'carat']).unstack('color').plot(style='o')
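To make the set_index and unstack reshaping concrete, here is a minimal sketch. The sample values are hypothetical (the original lists are not available here), the Agg backend is an assumption so that the snippet runs without a display, and selecting the price column before unstacking is an added step that keeps the column labels flat.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import pandas as pd

# Hypothetical sample values; each (color, carat) pair must be unique,
# otherwise unstack cannot build the wide table.
df = pd.DataFrame({
    "carat": [0.5, 1.0, 1.5, 0.5, 1.0, 1.5],
    "price": [100, 250, 400, 150, 300, 500],
    "color": ["D", "D", "D", "E", "E", "E"],
})

# Move color and carat into the index, keep price, then pivot color out into columns.
wide = df.set_index(["color", "carat"])["price"].unstack("color")
print(wide)

# The index (carat) becomes x; each column (one per color) is drawn as its own series.
ax = wide.plot(style="o")
ax.set(xlabel="carat", ylabel="price")
```

The printed table has one row per carat value and one column per color, which is the shape DataFrame.plot expects when each category should get its own color.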
You can pass plt.scatter a c argument, which allows you to select the colors. The following code defines a colors dictionary to map the diamond colors to the plotting colors.

    fig, ax = plt.subplots(figsize=(6, 6))

    colors = {'D': 'tab:blue'}  # plus one entry for each remaining diamond color

    ax.scatter(df['carat'], df['price'], c=df['color'].map(colors))

    # build the legend entries by hand, one marker per dictionary entry
    handles = [Line2D([0], [0], marker='o', color='w', markerfacecolor=v, label=k, markersize=8) for k, v in colors.items()]
    ax.legend(title='color', handles=handles, bbox_to_anchor=(1.05, 1), loc='upper left')

df['color'].map(colors) effectively maps the colors from "diamond" to "plotting".

(Forgive me for not putting another example image up, I think 2 is enough :P)

With seaborn

You can use seaborn, which is a wrapper around matplotlib that makes plots look prettier by default (rather opinion-based, I know :P) but also adds some plotting functions. For this you could use seaborn.lmplot with fit_reg=False (which prevents it from automatically doing some regression).

    sns.lmplot(x='carat', y='price', data=df, hue='color', fit_reg=False)

Selecting hue='color' tells seaborn to split and plot the data based on the unique values in the 'color' column. sns.scatterplot(x='carat', y='price', data=df, hue='color', ec=None) also does the same thing.

If you don't want to use seaborn, use DataFrame.groupby to split the data by color, and then plot each group using just matplotlib. You'll have to assign colors manually as you go. I've added an example below:

    fig, ax = plt.subplots(figsize=(6, 6))

    for key, group in df.groupby('color'):
        group.plot(ax=ax, kind='scatter', x='carat', y='price', label=key, color=colors[key])
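For a version of the groupby loop that runs on its own, the sketch below may help. The data and the three-entry palette are hypothetical, the Agg backend is an assumption for running without a display, and it calls ax.scatter directly rather than DataFrame.plot(kind='scatter'), which is a small substitution and not the code from the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for the diamonds DataFrame.
df = pd.DataFrame({
    "carat": [0.3, 0.5, 0.7, 0.4, 0.9, 1.1],
    "price": [400, 900, 2100, 700, 3500, 5200],
    "color": ["D", "E", "F", "D", "E", "F"],
})

# Hypothetical palette: one plotting color per diamond color present in df.
colors = {"D": "tab:blue", "E": "tab:orange", "F": "tab:green"}

fig, ax = plt.subplots(figsize=(6, 6))

# groupby yields (key, sub-DataFrame) pairs in sorted key order;
# each group is drawn as its own scatter with its own label and color.
for key, group in df.groupby("color"):
    ax.scatter(group["carat"], group["price"], color=colors[key], label=key)

ax.set(xlabel="carat", ylabel="price")
ax.legend(title="color")
```

Because every group is plotted with its own label, ax.legend() can build the legend automatically and no handcrafted Line2D handles are needed.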
Imports and Sample DataFrame

    import matplotlib.pyplot as plt
    from matplotlib.lines import Line2D  # for legend handle

       carat      cut color clarity  depth  table  price     x     y     z
    0   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
    1   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
    2   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31

If I understand your question correctly, you are looking to plot selected numerical columns against a selected categorical column of your dataset, am I right? If so, you can use the dplyr, tidyr and ggplot2 packages to achieve this.

Starting with this dataframe:

    id num1 num2 num3 cat cat2

Basically, you first select your columns of interest (here num1, num2 and cat), then reshape the data into a longer format using the pivot_longer function:

    library(tidyr)
    library(dplyr)
    df %>% select(num1, num2, cat) %>%
      pivot_longer(., cols = c(num1,num2), names_to = "Var", values_to = "Val")

Finally, you can add the plotting part to this pipe sequence by calling ggplot and geom_boxplot:

    library(tidyr)
    library(dplyr)
    library(ggplot2)
    df %>% select(num1, num2, cat) %>%
      pivot_longer(., cols = c(num1,num2), names_to = "Var", values_to = "Val") %>%
      ggplot(aes(x = Var, y = Val, fill = cat)) + geom_boxplot()

Data

    id <- sample(LETTERS,100, replace = TRUE)
    cat <- sample(letters,100, replace = TRUE)
    cat2 <- sample(letters,100, replace = TRUE)
    df <- data.frame(...)
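For readers following the Python examples on this page, the same select, reshape and boxplot sequence can be approximated in pandas. This is only a rough analogue of the R pipeline, not a translation by its author: DataFrame.melt stands in for pivot_longer, DataFrame.boxplot stands in for geom_boxplot, and the random data, the three-letter category set and the Agg backend are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import numpy as np
import pandas as pd

# Hypothetical data with the same column roles as the R example.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "num1": rng.normal(size=n),
    "num2": rng.normal(size=n),
    "num3": rng.normal(size=n),
    "cat": rng.choice(list("abc"), size=n),
})

# Select the columns of interest, then reshape to long format:
# one row per (cat, Var) observation, with the measurement in Val.
long = df[["num1", "num2", "cat"]].melt(
    id_vars="cat", var_name="Var", value_name="Val"
)

# Draw one box per (Var, cat) combination.
long.boxplot(column="Val", by=["Var", "cat"], rot=90)
```

The long table has twice as many rows as df, since each original row contributes one row for num1 and one for num2.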