TheJoker

class thejoker.TheJoker(prior, pool=None, rng=None, tempfile_path=None)

Bases: object

A custom Monte-Carlo sampler for two-body systems.

Parameters:
  • prior (thejoker.JokerPrior) – The specification of the prior probability distribution over all parameters used in The Joker.

  • pool (schwimmbad.BasePool (optional)) – A processing pool (default is a schwimmbad.SerialPool instance).

  • rng (numpy.random.Generator (optional)) – A numpy.random.Generator instance for controlling random number generation.

  • tempfile_path (str (optional)) – A location on disk where The Joker may store temporary files. The Joker is expected to clean up any files it writes here; if files persist in this path, something went wrong within The Joker. Default: ~/.thejoker

Attributes Summary

tempfile_path

Methods Summary

iterative_rejection_sample(data, ...[, ...])

This is an experimental sampling method that adaptively generates posterior samples given a large library of prior samples.

marginal_ln_likelihood(data, prior_samples)

Compute the marginal log-likelihood at each of the input prior samples.

rejection_sample(data, prior_samples[, ...])

Run The Joker's rejection sampling on prior samples to get posterior samples for the input data.

setup_mcmc(data, joker_samples[, model, ...])

Set up the model to run MCMC using pymc.

Attributes Documentation

tempfile_path

Methods Documentation

iterative_rejection_sample(data, prior_samples, n_requested_samples, max_prior_samples=None, n_linear_samples=1, return_logprobs=False, n_batches=None, randomize_prior_order=False, init_batch_size=None, growth_factor=128, in_memory=False)

This is an experimental sampling method that adaptively generates posterior samples given a large library of prior samples. The advantage of this function over the standard rejection_sample method is that it will try to adaptively figure out how many prior samples it needs to evaluate the likelihood at in order to return the desired number of posterior samples.

Parameters:
  • data (thejoker.RVData) – The radial velocity data, or an iterable containing RVData objects for each data source.

  • prior_samples (str, thejoker.JokerSamples) – Either a path to a file containing prior samples generated from The Joker, or a thejoker.JokerSamples instance containing the prior samples.

  • n_requested_samples (int) – The number of posterior samples desired.

  • max_prior_samples (int (optional)) – The maximum number of prior samples to process.

  • n_linear_samples (int (optional)) – The number of linear parameter samples to generate for each nonlinear parameter sample returned from the rejection sampling step.

  • return_logprobs (bool (optional)) – Also return the log-prior and (marginal) log-likelihood values evaluated at each sample.

  • n_batches (int (optional)) – The number of batches to split the prior samples into before distributing for computation. If using the (default) serial computation pool, this doesn’t have any impact. If using multiprocessing or MPI, this determines how many batches to split the samples into before scattering over all workers.

  • randomize_prior_order (bool (optional)) – Randomly shuffle the prior samples before reading and running the rejection sampler. This is only useful if you are using a large library of prior samples, and choosing to run on a subset of those samples.

  • init_batch_size (int (optional)) – The initial batch size of likelihoods to compute, before growing the batches using the multiplicative growth factor, below.

  • growth_factor (int (optional)) – A factor used to adaptively grow the number of prior samples to evaluate on. Larger numbers make the trial batches grow faster.

  • in_memory (bool (optional)) – If True, load all prior samples into memory (or keep them there if they are already loaded) and run all calculations without creating a temporary cache file.

Returns:

samples – The posterior samples produced from The Joker.

Return type:

thejoker.JokerSamples
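
The adaptive behaviour described above can be illustrated with a small standalone sketch. It is not The Joker's implementation: the function name, the toy likelihood, and the default values of init_batch_size and growth_factor used here are hypothetical, and the accept/reject rule is the textbook one (keep a sample when a uniform draw falls below its likelihood relative to the best likelihood seen so far). The point is only to show how growing the batch multiplicatively lets the sampler stop as soon as enough posterior samples have been accepted.

```python
import math
import random


def iterative_rejection_sketch(ln_likelihood_fn, n_prior, n_requested,
                               init_batch_size=64, growth_factor=4,
                               max_prior_samples=None, seed=0):
    """Evaluate more and more prior samples until enough are accepted.

    Returns (accepted_indices, n_evaluated). All names and defaults here are
    illustrative assumptions, not the library's internals.
    """
    rng = random.Random(seed)
    limit = n_prior if max_prior_samples is None else min(n_prior, max_prior_samples)
    if limit <= 0:
        return [], 0

    ln_like = []    # cached ln-likelihood of every evaluated prior sample
    uniforms = []   # one uniform draw per evaluated sample, fixed once drawn
    n_eval = min(init_batch_size, limit)
    while True:
        # Only evaluate the samples that have not been processed yet.
        for i in range(len(ln_like), n_eval):
            ln_like.append(ln_likelihood_fn(i))
            uniforms.append(rng.random())

        # Re-run the accept/reject test against the current maximum.
        ln_max = max(ln_like)
        accepted = [i for i in range(n_eval)
                    if uniforms[i] < math.exp(ln_like[i] - ln_max)]

        if len(accepted) >= n_requested or n_eval >= limit:
            return accepted[:n_requested], n_eval

        # Not enough posterior samples yet: grow the trial batch.
        n_eval = min(n_eval * growth_factor, limit)


# Toy problem: a uniform "prior" on [0, 1) and a narrow Gaussian likelihood.
prior_rng = random.Random(1)
prior_x = [prior_rng.random() for _ in range(10_000)]


def toy_ln_likelihood(i):
    return -0.5 * ((prior_x[i] - 0.3) / 0.05) ** 2


accepted, n_used = iterative_rejection_sketch(
    toy_ln_likelihood, len(prior_x), n_requested=100
)
print(f"{len(accepted)} samples accepted after evaluating {n_used} prior samples")
```

In this sketch a larger growth_factor reaches the requested number of samples in fewer rounds, at the cost of possibly evaluating many more likelihoods than were needed in the final round.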

marginal_ln_likelihood(data, prior_samples, n_batches=None, in_memory=False)

Compute the marginal log-likelihood at each of the input prior samples.

Parameters:
  • data (thejoker.RVData, iterable, dict) – The radial velocity data, or an iterable containing RVData objects for each data source.

  • prior_samples (str, thejoker.JokerSamples) – Either a path to a file containing prior samples generated from The Joker, or a thejoker.JokerSamples instance containing the prior samples.

  • n_batches (int (optional)) – The number of batches to split the prior samples into before distributing for computation. If using the (default) serial computation pool, this doesn’t have any impact. If using multiprocessing or MPI, this determines how many batches to split the samples into before scattering over all workers.

  • in_memory (bool (optional)) – If True, load all prior samples into memory (or keep them there if they are already loaded) and run all calculations without creating a temporary cache file.

Returns:

ln_likelihood – The marginal log-likelihood computed at the location of each prior sample.

Return type:

numpy.ndarray
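
To make the role of n_batches more concrete, here is a minimal sketch of one way a set of prior samples could be divided into contiguous index ranges before being scattered over workers. The helper batch_ranges is hypothetical, and this page does not specify how The Joker actually partitions the samples, so treat it as an illustration of the idea only.

```python
def batch_ranges(n_samples, n_batches):
    """Split range(n_samples) into contiguous (start, stop) index pairs.

    Hypothetical helper for illustration. Batch sizes differ by at most one,
    and no more batches are created than there are samples.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    n_batches = min(n_batches, max(n_samples, 1))
    base, extra = divmod(n_samples, n_batches)

    ranges = []
    start = 0
    for b in range(n_batches):
        # The first `extra` batches each take one additional sample.
        stop = start + base + (1 if b < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


ranges = batch_ranges(10, 4)
print(ranges)  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

With a serial pool the batches would simply be processed one after another, which is consistent with the note above that n_batches has no impact in that case.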

rejection_sample(data, prior_samples, n_prior_samples=None, max_posterior_samples=None, n_linear_samples=1, return_logprobs=False, return_all_logprobs=False, n_batches=None, randomize_prior_order=False, in_memory=False)

Run The Joker’s rejection sampling on prior samples to get posterior samples for the input data.

The prior samples are supplied through prior_samples, either as a path to a file containing prior samples or as a thejoker.JokerSamples instance. When a file is given, n_prior_samples can be used to run on only a subset of the samples it contains.

Parameters:
  • data (thejoker.RVData) – The radial velocity data, or an iterable containing RVData objects for each data source.

  • prior_samples (str, thejoker.JokerSamples, int) – Either a path to a file containing prior samples generated from The Joker, or a thejoker.JokerSamples instance containing the prior samples.

  • n_prior_samples (int (optional)) – The number of prior samples to run on. This is only used if passing in a string filename: If the file contains a large number of prior samples, you may want to set this to only run on a subset.

  • max_posterior_samples (int (optional)) – The maximum number of posterior samples to generate. If using a large library of prior samples and running on uninformative data, you may want to set this to a small but reasonable number (for example, 256).

  • n_linear_samples (int (optional)) – The number of linear parameter samples to generate for each nonlinear parameter sample returned from the rejection sampling step.

  • return_logprobs (bool (optional)) – Also return the log-prior and (marginal) log-likelihood values evaluated at each sample.

  • return_all_logprobs (bool (optional)) – Also return the marginal log-likelihood values at every prior sample used in this sampling. This can require a large amount of memory when many prior samples are processed.

  • n_batches (int (optional)) – The number of batches to split the prior samples into before distributing for computation. If using the (default) serial computation pool, this doesn’t have any impact. If using multiprocessing or MPI, this determines how many batches to split the samples into before scattering over all workers.

  • randomize_prior_order (bool (optional)) – Randomly shuffle the prior samples before reading and running the rejection sampler. This is only useful if you are using a large library of prior samples, and choosing to run on a subset of those samples.

  • in_memory (bool (optional)) – If True, load all prior samples into memory (or keep them there if they are already loaded) and run all calculations without creating a temporary cache file.

Returns:

samples – The posterior samples produced from The Joker.

Return type:

thejoker.JokerSamples
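
As a rough illustration of what rejection sampling on prior samples means, the sketch below applies the standard accept/reject rule to a list of marginal log-likelihood values: a sample is kept when a uniform random number is smaller than its likelihood divided by the largest likelihood in the set. The function name is hypothetical, the cap mimics the documented purpose of max_posterior_samples, and nothing here should be read as the library's exact internals.

```python
import math
import random


def rejection_sample_indices(ln_likelihoods, rng, max_posterior_samples=None):
    """Return the indices of prior samples that survive rejection sampling.

    Hypothetical helper: sample i is kept when u_i < exp(lnL_i - max(lnL)),
    with u_i drawn uniformly from [0, 1).
    """
    if not ln_likelihoods:
        return []
    ln_max = max(ln_likelihoods)
    accepted = [
        i for i, ln_l in enumerate(ln_likelihoods)
        if rng.random() < math.exp(ln_l - ln_max)
    ]
    # Optionally cap the number of posterior samples that are returned.
    if max_posterior_samples is not None:
        accepted = accepted[:max_posterior_samples]
    return accepted


ln_like = [-50.0, -0.5, 0.0, -30.0, -1.0, 0.0]
kept = rejection_sample_indices(ln_like, random.Random(42), max_posterior_samples=2)
print(kept)
```

Samples whose likelihood equals the maximum are always kept under this rule, which is why informative data (a sharply peaked likelihood) leaves only a few surviving samples while uninformative data lets most of the prior samples through.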

setup_mcmc(data, joker_samples, model=None, custom_func=None)

Set up the model to run MCMC using pymc.

Parameters:
  • data (thejoker.RVData) – The radial velocity data, or an iterable containing RVData objects for each data source.

  • joker_samples (thejoker.JokerSamples) – If a single sample is passed in, this is packed into a pymc initialization dictionary and returned after setting up. If multiple samples are passed in, the median (along period) sample is taken and returned after setting up for MCMC.

  • model (pymc.Model (optional)) – The pymc model to use. Either pass a model explicitly, or call this function within a pymc model context.

  • custom_func (callable (optional)) –

Returns:

mcmc_init

Return type:

dict
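
The phrase "median (along period) sample" can be read as: order the samples by period and take the one in the middle. The sketch below shows that reading on plain dictionaries. The helper name, the dictionary layout, and the choice of the upper-middle element for an even number of samples are assumptions made for illustration, not a description of what setup_mcmc does internally.

```python
def median_period_sample(samples):
    """Return the sample whose period "P" is the median of all periods.

    Hypothetical helper. For an even number of samples, the upper of the two
    middle samples is returned, so the result is always an actual sample.
    """
    ordered = sorted(samples, key=lambda s: s["P"])
    return ordered[len(ordered) // 2]


# Illustrative samples: period "P" (days) and eccentricity "e".
samples = [
    {"P": 12.0, "e": 0.1},
    {"P": 3.5, "e": 0.4},
    {"P": 40.0, "e": 0.0},
    {"P": 7.2, "e": 0.2},
    {"P": 9.9, "e": 0.3},
]
mcmc_init = median_period_sample(samples)
print(mcmc_init)  # {'P': 9.9, 'e': 0.3}
```

Selecting an existing sample, rather than averaging parameters across samples, keeps the starting point self-consistent when the posterior has several distinct period modes.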