pandas_genomics.arrays.GenotypeArray
class pandas_genomics.arrays.GenotypeArray(values: Union[List[pandas_genomics.scalars.Genotype], pandas_genomics.arrays.genotype_array.GenotypeArray, numpy.ndarray], dtype: Optional[pandas_genomics.arrays.genotype_array.GenotypeDtype] = None, copy: bool = False)
Holder for genotypes.
Variant information is stored as part of the type, and the genotype is stored as a pair of integer arrays.
- Parameters
- values: list-like
The values of the genotypes.
- dtype: GenotypeDtype
The specific parameterized type. Optional if it can be inferred from values.
- Attributes
- dtype: GenotypeDtype
The specific parameterized type
- data: np.dtype("u8") with shape (<genotypes>, <ploidy>)
The genotype values encoded as indices into the allele list of the dtype
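As a rough illustration of the storage layout described above, the following NumPy-only sketch encodes diploid genotypes as indices into an allele list. It does not use pandas_genomics itself: the allele list, the array names, the integer width, and the sentinel value 255 for a missing allele are all assumptions made for illustration, not a statement about the library's internals.

```python
import numpy as np

# Hypothetical allele list for one variant; index 0 is treated as the reference.
alleles = ["A", "C"]

# Assumed sentinel for a missing allele (illustration only).
MISSING = 255

# One row per genotype, one column per allele copy (ploidy = 2).
allele_idxs = np.array(
    [[0, 0], [0, 1], [1, 1], [MISSING, MISSING]],
    dtype=np.uint8,
)


def decode(row):
    """Turn one row of allele indices back into a string such as 'A/C'."""
    return "/".join("." if i == MISSING else alleles[i] for i in row)


decoded = [decode(row) for row in allele_idxs]
print(decoded)
```

Keeping the allele strings with the type and only small integers in the array means every genotype of the same variant shares one allele list.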
__init__(values: Union[List[pandas_genomics.scalars.Genotype], pandas_genomics.arrays.genotype_array.GenotypeArray, numpy.ndarray], dtype: Optional[pandas_genomics.arrays.genotype_array.GenotypeDtype] = None, copy: bool = False)
Initialize assuming values is a GenotypeArray or a numpy array with the correct underlying shape
Methods
__init__(values[, dtype, copy])
Initialize assuming values is a GenotypeArray or a numpy array with the correct underlying shape
argmax([skipna])
Return the index of maximum value.
argmin([skipna])
Return the index of minimum value.
argsort([ascending, kind, na_position])
Return the indices that would sort this array.
astype(dtype[, copy])
Cast to a NumPy array with ‘dtype’.
copy()
Return a copy of the array.
delete(loc)
dropna()
Return ExtensionArray without NA values.
encode_additive()
Additive Encoding
encode_codominant()
This encodes the genotype into three categories.
encode_dominant()
Dominant Encoding
encode_edge(alpha_value, ref_allele, …)
Perform EDGE (weighted) encoding.
encode_recessive()
Recessive Encoding
equals(other)
Return if another array is equivalent to this array.
factorize([na_sentinel])
Return an array of ints indexing unique values
fillna([value, method, limit])
Fill NA/NaN values using the specified method.
is_genotype_array(other)
isin(values)
Pointwise comparison for set containment in the given values.
isna()
A 1-D array indicating if each value is missing
ravel([order])
Return a flattened view on this array.
repeat(repeats[, axis])
Repeat elements of an ExtensionArray.
searchsorted(value[, side, sorter])
Find indices where elements should be inserted to maintain order.
set_reference(allele)
Change the reference allele (in-place) by specifying an allele index value or an allele string
shift([periods, fill_value])
Shift values by desired number.
take(indexer[, allow_fill, fill_value])
Take elements from an array.
to_numpy([dtype, copy, na_value])
Convert to a NumPy ndarray.
transpose(*axes)
Return a transposed view on this array.
unique()
Return a GenotypeArray of unique values
value_counts([dropna])
Return a Series of unique counts with a GenotypeArray index
view([dtype])
Return a view on the array.
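The encode_* methods listed above are named after standard genetic encodings of a biallelic, diploid genotype. The sketch below shows those conventional definitions with plain NumPy; it is not the library's implementation. The input `alt_count` (number of alternate-allele copies per sample) and the category labels are hypothetical, and missing genotypes and the EDGE weighting are left out.

```python
import numpy as np

# Hypothetical input: alternate-allele copies per sample (0, 1, or 2).
alt_count = np.array([0, 1, 2, 1, 0])

# Additive: the number of alternate-allele copies.
additive = alt_count.copy()

# Dominant: 1 if at least one alternate allele is present.
dominant = (alt_count >= 1).astype(int)

# Recessive: 1 only if both copies are the alternate allele.
recessive = (alt_count == 2).astype(int)

# Codominant: three unordered categories (labels are assumed names).
labels = np.array(["Ref", "Het", "Hom"])
codominant = labels[alt_count]

print(additive, dominant, recessive, codominant)
```

The three numeric encodings differ only in how the heterozygous genotype is scored: as half the homozygous-alternate effect (additive), the same as homozygous alternate (dominant), or the same as homozygous reference (recessive).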
Attributes
T
allele_idxs
Return the allele indices for each genotype
dtype
The specific parameterized type
gt_scores
Return the genotype score for each genotype (as a float)
hwe_pval
Calculate the probability that the samples are in HWE for diploid variants
is_heterozygous
Boolean array: True if the sample is heterozygous for any alleles
is_homozygous
Boolean array: True if the sample is homozygous for any allele
is_homozygous_alt
Boolean array: True if the sample is homozygous for any non-reference allele
is_homozygous_ref
Boolean array: True if the sample is homozygous for the reference allele
is_missing
Boolean array: True if the sample is missing all alleles
maf
Calculate the Minor Allele Frequency (MAF) for the most-frequent alternate allele.
nbytes
How many bytes to store this object in memory
ndim
Extension Arrays are only allowed to be 1-dimensional.
shape
Return a tuple of the array dimensions.
size
The number of elements in the array.
variant
Return the variant identifier
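To make the boolean attributes and the allele-frequency idea above concrete, here is a minimal NumPy sketch computed from a hypothetical allele-index array. It assumes a biallelic, diploid variant, genotypes that are either fully called or fully missing, and a sentinel of 255 for a missing allele; none of this is taken from the library's code, and the HWE test is not covered.

```python
import numpy as np

MISSING = 255  # assumed sentinel for a missing allele (illustration only)

# Hypothetical genotypes: index 0 is the reference allele, 1 the alternate.
allele_idxs = np.array(
    [[0, 0], [0, 1], [1, 1], [0, 1], [MISSING, MISSING]],
    dtype=np.uint8,
)

# A genotype counts as missing when every allele is missing.
is_missing = (allele_idxs == MISSING).all(axis=1)

# Homozygous / heterozygous flags, excluding missing genotypes.
same = allele_idxs[:, 0] == allele_idxs[:, 1]
is_homozygous = same & ~is_missing
is_heterozygous = ~same & ~is_missing
is_homozygous_ref = is_homozygous & (allele_idxs[:, 0] == 0)
is_homozygous_alt = is_homozygous & (allele_idxs[:, 0] != 0)

# Frequency of the alternate allele among called alleles.
called = allele_idxs[~is_missing]
alt_freq = (called == 1).sum() / called.size

print(is_heterozygous, alt_freq)
```

With more than one alternate allele, the same counting would be repeated per alternate allele and the most frequent one reported, which matches the wording of the `maf` description.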