document

class anadama2.document.Document

A document that is auto-generated from a template.

Parameters:
  • templates (str or list) – The document template files (or file)
  • depends (list of anadama2.tracked.Base or strings) – The list of dependencies.
  • targets (anadama2.tracked.Base or string) – The target(s). The document(s) to be generated.
  • vars (dict) – A dictionary of variables used by the template.

class anadama2.document.PweaveDocument(templates=None, depends=None, targets=None, vars=None, table_of_contents=None)

A document that is auto-generated from a template using Pweave and Pandoc.

Parameters:
  • templates (str or list) – The document template files (or file)
  • depends (list of anadama2.tracked.Base or strings) – The list of dependencies.
  • targets (anadama2.tracked.Base or string) – The target(s). The document(s) to be generated.
  • vars (dict) – A dictionary of variables used by the template.
  • table_of_contents (bool) – If set, add a table of contents to the reports

compute_pcoa(sample_names, feature_names, data, apply_transform)

Use the vegan package in R to compute a PCoA. Input data should be organized with samples as columns and features as rows. Data should be scaled to the range [0, 1] if the transform is to be applied.

Parameters:
  • sample_names (list) – The labels for the columns
  • feature_names (list) – The labels for the data rows
  • data (list) – A list of lists containing the data
  • apply_transform (bool) – Whether to apply the arcsin transform
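
The scaling requirement follows from the domain of the transform. The sketch below assumes the arcsine square-root variant that is commonly applied to proportion data (the text above says only "arcsin"). `arcsin_sqrt_transform` is a hypothetical helper written for illustration, not part of anadama2; it shows the expected layout (features as rows, samples as columns) and why values outside [0, 1] cannot be transformed.

```python
import math

def arcsin_sqrt_transform(data):
    """Apply asin(sqrt(x)) to every value.

    Hypothetical helper for illustration only; not anadama2 code.
    """
    transformed = []
    for row in data:
        for value in row:
            # asin(sqrt(x)) is only defined for 0 <= x <= 1
            if not 0.0 <= value <= 1.0:
                raise ValueError("values must be scaled to [0, 1], got %r" % value)
        transformed.append([math.asin(math.sqrt(value)) for value in row])
    return transformed

# Rows are features, columns are samples.
sample_names = ["sample1", "sample2", "sample3"]
feature_names = ["featureA", "featureB"]
data = [
    [0.25, 0.50, 1.00],
    [0.75, 0.50, 0.00],
]

transformed = arcsin_sqrt_transform(data)
```

Percentages in the range 0 to 100 would need to be divided by 100 before being passed to a transform of this kind.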

create(task)

Create the documents specified as targets

filter_zero_columns(column_names, data)

Filter the columns from the data set that sum to zero

Parameters:
  • column_names (list) – The names of the columns
  • data (list) – A list of lists containing the data
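
The behavior described above can be sketched in plain Python. `filter_zero_columns_sketch` is a hypothetical stand-in, not the anadama2 implementation, and the sketch assumes the filtered names and data are returned as a pair, which the text above does not state.

```python
def filter_zero_columns_sketch(column_names, data):
    """Drop columns whose values sum to zero.

    Hypothetical stand-in for illustration; not anadama2 code.
    """
    # Column totals across all rows.
    totals = [sum(column) for column in zip(*data)]
    keep = [index for index, total in enumerate(totals) if total != 0]
    kept_names = [column_names[index] for index in keep]
    kept_data = [[row[index] for index in keep] for row in data]
    return kept_names, kept_data

column_names = ["sample1", "sample2", "sample3"]
data = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 0.0],
]
kept_names, kept_data = filter_zero_columns_sketch(column_names, data)
print(kept_names)  # ['sample1', 'sample3']
print(kept_data)   # [[1.0, 2.0], [3.0, 0.0]]
```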

filter_zero_rows(row_names, data)

Filter the rows from the data set that sum to zero

Parameters:
  • row_names (list) – The names of the rows
  • data (list) – A list of lists containing the data
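
A row-wise filter of this kind might look like the following. `filter_zero_rows_sketch` is a hypothetical stand-in rather than the anadama2 implementation, and the (names, data) return pair is an assumption.

```python
def filter_zero_rows_sketch(row_names, data):
    """Drop rows whose values sum to zero.

    Hypothetical stand-in for illustration; not anadama2 code.
    """
    kept_names, kept_data = [], []
    for name, row in zip(row_names, data):
        if sum(row) != 0:
            kept_names.append(name)
            kept_data.append(row)
    return kept_names, kept_data

row_names = ["featureA", "featureB", "featureC"]
data = [
    [1.0, 2.0],
    [0.0, 0.0],
    [0.0, 3.0],
]
kept_names, kept_data = filter_zero_rows_sketch(row_names, data)
print(kept_names)  # ['featureA', 'featureC']
print(kept_data)   # [[1.0, 2.0], [0.0, 3.0]]
```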

get_vars()

Try to get the variables from the pickled file

plot_barchart(data, labels=None, title=None, xlabel=None, ylabel=None)

Plot a barchart

Parameters:
  • data (list) – A list of lists containing the data
  • labels (list) – The labels for the data rows
  • title (str) – The title for the plot
  • xlabel (str) – The x-axis label
  • ylabel (str) – The y-axis label

plot_grouped_barchart(data, row_labels, column_labels, title, xlabel=None, ylabel=None, legend_title=None, yaxis_in_millions=None)

Plot a grouped barchart

Parameters:
  • data (list) – A list of lists containing the data
  • row_labels (list) – The labels for the data rows
  • column_labels (list) – The labels for the columns
  • title (str) – The title for the plot
  • xlabel (str) – The x-axis label
  • ylabel (str) – The y-axis label
  • legend_title (str) – The title for the legend
  • yaxis_in_millions (bool) – Show the y-axis in millions

plot_scatter(data, title, row_labels, xlabel=None, ylabel=None, trendline=None)

Plot a scatter plot

Parameters:
  • data (list) – A list of lists containing the data
  • title (str) – The title for the plot
  • row_labels (list) – The labels for the data rows
  • xlabel (str) – The x-axis label
  • ylabel (str) – The y-axis label
  • trendline (bool) – Add a trendline to the plot

plot_stacked_barchart(data, row_labels, column_labels, title, xlabel=None, ylabel=None, legend_title=None, legend_style='normal', legend_size=7)

Plot a stacked barchart

Parameters:
  • data (list) – A list of lists containing the data
  • row_labels (list) – The labels for the data rows
  • column_labels (list) – The labels for the columns
  • title (str) – The title for the plot
  • xlabel (str) – The x-axis label
  • ylabel (str) – The y-axis label
  • legend_title (str) – The title for the legend
  • legend_style (str) – The font style for the legend
  • legend_size (int) – The font size for the legend

plot_stacked_barchart_grouped(grouped_data, row_labels, column_labels_grouped, title, ylabel=None, legend_title=None, legend_style='normal', legend=True, legend_size=7)

Plot a stacked barchart with data grouped into subplots

Parameters:
  • grouped_data (dict) – A dict of lists containing the grouped data
  • row_labels (list) – The labels for the data rows
  • column_labels_grouped – The labels for the columns grouped
  • title (str) – The title for the plot
  • ylabel (str) – The y-axis label
  • legend_title (str) – The title for the legend
  • legend_style (str) – The font style for the legend
  • legend (bool) – Display legend
  • legend_size (int) – The font size for the legend

read_table(file, invert=None, delimiter='\t', only_data_columns=None, format_data=None)

Read a table from a text file in which the first line holds the column names and the first column holds the row names.

Parameters:
  • file (str) – The file to read
  • invert (bool) – Invert the table rows/columns after reading
  • delimiter (str) – The delimiter present in the file
  • only_data_columns (bool) – Remove the header and row names
  • format_data (function) – A function to use to format the data
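
The expected file layout can be illustrated with a small parser built on the standard `csv` module. `read_table_sketch` is a hypothetical stand-in, not the anadama2 implementation; the return order (column names, row names, data) and the conversion of every value to `float` are assumptions made for this example.

```python
import csv
import io

def read_table_sketch(handle, delimiter="\t"):
    """Parse a delimited table with a header line and a row-name column.

    Hypothetical stand-in for illustration; not anadama2 code.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    column_names = header[1:]  # skip the corner cell above the row names
    row_names, data = [], []
    for fields in reader:
        if not fields:
            continue  # ignore blank lines
        row_names.append(fields[0])
        data.append([float(value) for value in fields[1:]])
    return column_names, row_names, data

# An in-memory file stands in for a path on disk.
text = "feature\tsample1\tsample2\nfeatureA\t0.25\t0.75\nfeatureB\t0.5\t0.5\n"
column_names, row_names, data = read_table_sketch(io.StringIO(text))
print(column_names)  # ['sample1', 'sample2']
print(row_names)     # ['featureA', 'featureB']
print(data)          # [[0.25, 0.75], [0.5, 0.5]]
```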

show_hclust2(sample_names, feature_names, data, title, log_scale=True, zscore=False, metadata_rows=None, method='correlation')

Create a hclust2 heatmap with dendrogram and show it in the document

Parameters:
  • sample_names (list) – The names of the samples
  • feature_names (list) – The names of the features
  • data (list) – A list of lists containing the data
  • title (str) – The title for the plot
  • log_scale (bool) – Show the heatmap with the log scale
  • zscore (bool) – Apply the zscore to the data prior to clustering
  • metadata_rows (list) – A list of metadata rows
  • method (str) – The distance function for features

show_pcoa(sample_names, feature_names, data, title, sample_types='samples', feature_types='species', metadata=None, apply_transform=False, sort_function=None, metadata_type=None)

Use the vegan package in R plus matplotlib to plot a PCoA. Input data should be organized with samples as columns and features as rows. Data should be scaled to the range [0, 1] if the transform is to be applied.

Parameters:
  • sample_names (list) – The labels for the columns
  • feature_names (list) – The labels for the data rows
  • data (list) – A list of lists containing the data
  • title (str) – The title for the plot
  • sample_types (str) – The type of data in the columns
  • feature_types (str) – The type of data in the rows
  • metadata (dict) – Metadata for each sample
  • apply_transform (bool) – Whether to apply the arcsin transform
  • sort_function (lambda) – The function used to sort the plot data
  • metadata_type (str) – The type of metadata (continuous or categorical)

show_pcoa_multiple_plots(sample_names, feature_names, data, title, abundances, legend_title='% Abundance', sample_types='samples', feature_types='species', apply_transform=False)

Use the vegan package in R plus matplotlib to plot a PCoA. Input data should be organized with samples as columns and features as rows. Data should be scaled to the range [0, 1] if the transform is to be applied. Multiple PCoA plots are shown as subplots, each colored by abundance.

Parameters:
  • sample_names (list) – The labels for the columns
  • feature_names (list) – The labels for the data rows
  • data (list) – A list of lists containing the data
  • title (str) – The title for the plot
  • abundances (dict) – The sets of abundance data and names for the subplots
  • legend_title (str) – The title for the legend
  • sample_types (str) – The type of data in the columns
  • feature_types (str) – The type of data in the rows
  • apply_transform (bool) – Whether to apply the arcsin transform

show_table(data, row_labels, column_labels, title, format_data_comma=None, location='center', font=None)

Plot the data as a table

Parameters:
  • data (list) – A list of lists containing the data
  • row_labels (list) – The labels for the data rows
  • column_labels (list) – The labels for the columns
  • title (str) – The title for the plot
  • format_data_comma (bool) – Format the data as comma delimited
  • location (str) – The location for the text in the cell
  • font (int) – The size of the font

sorted_data_numerical_or_alphabetical(data)

Sort the data numerically or alphabetically depending on data type
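
One plausible reading of this behavior is sketched below: sort by numeric value when every item converts to a number, and fall back to an alphabetical sort otherwise. `sort_numerical_or_alphabetical_sketch` is a hypothetical helper, and the fallback rule is an assumption; the actual method may decide differently.

```python
def sort_numerical_or_alphabetical_sketch(data):
    """Sort numerically when possible, else alphabetically.

    Hypothetical helper for illustration; not anadama2 code.
    """
    try:
        # Works only if every item can be converted to a float.
        return sorted(data, key=float)
    except (TypeError, ValueError):
        return sorted(data)

print(sort_numerical_or_alphabetical_sketch(["10", "9", "2"]))       # ['2', '9', '10']
print(sort_numerical_or_alphabetical_sketch(["sampleB", "sampleA"]))  # ['sampleA', 'sampleB']
```

The numeric branch matters for labels such as "10" and "9", which a plain string sort would place in the wrong order.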

write_table(column_labels, row_labels, data, file)

Write a table of data to a file

Parameters:
  • column_labels (list) – The labels for the columns
  • row_labels (list) – The labels for the data rows
  • data (list) – A list of lists containing the data
  • file (str) – The file to write the table to
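
The output layout can be sketched with the standard `csv` module. `write_table_sketch` is a hypothetical stand-in, not the anadama2 implementation. It assumes a tab delimiter and assumes that `column_labels` already includes the label for the row-name column; neither detail is stated above. It also writes to an open handle rather than a path so the example stays self-contained.

```python
import csv
import io

def write_table_sketch(column_labels, row_labels, data, handle, delimiter="\t"):
    """Write a header line followed by one labeled line per data row.

    Hypothetical stand-in for illustration; not anadama2 code.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(column_labels)
    for label, row in zip(row_labels, data):
        writer.writerow([label] + list(row))

# An in-memory buffer stands in for a file on disk.
handle = io.StringIO()
write_table_sketch(
    ["feature", "sample1", "sample2"],
    ["featureA", "featureB"],
    [[1, 2], [3, 4]],
    handle,
)
output = handle.getvalue()
print(output)
```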