# Build Code
This document provides a comprehensive reference for understanding and configuring builds for the PetroAI pipeline. Suggested values and ranges are also included.
# MySQL Data Tables
GridAttributeData
GridAttributeHeader
GridStructureData
GridStructureHeader
InventoryWells
MicroseismicEvent
MonthlyProduction
ReservoirStressWellLogRecord
StressOrientationMeasure
Well
WellDirectionalSurveyPoint
WellExtra
WellLookup
These data tables are the sources for any input data used in the PetroAI pipeline.
# Products
products:
- core1
- core2
- core3
- raw
- diag
- grid
- inv
Defines the set of modules ("products") that are enabled to generate outputs.
core1
- PDP forecast parameters and volumes. Linear regression modeling is used to fit historical production for forecasts.
core2
- PDP feature attribution and analysis. Geologic grids are sampled to each well and feature correlations are calculated to support training feature optimization.
core3
- Predictions for oil and gas production based on chosen input features. Regression tree modeling quantifies the impact of each feature and uses analogous groupings to predict oil and gas production.
raw
- Pipeline for how wells are grouped for the decline curve analysis.
diag
- Diagnostic data for quantifying model quality and reliability. Compares model predictions against actuals and quantifies feature importance and impact using Shapley analysis.
grid
- Predictive forecasts of undeveloped wells using a generic grid. Multiple scenarios may be defined for differing engineering designs, and geologic attribution is based on provided input grids.
inv
- Discrete predictive forecasts of undeveloped wells loaded into the InventoryWells table. Explicitly defined well locations, designs, and timing are ingested to generate forecasts based on the given inputs and features within the model.
# Phases Configuration
# Shared Configurations
Suggested defaults provided below:
phases:
  shared:
    midas_project_options:
      batch_size_drain: 3
      batch_size_features: 100
      batch_size_frac: 10
      batch_size_plot: 3
      batch_num_tiles_earth: 1
      buffer_ft: 4000
      crs_proj4: +proj=utm +zone=13 +datum=NAD27 +units=m +no_defs
      generate_stage_method: fixed
      fixed_stage_count: 1
      frac_height_ft: 3000
      frac_width_ft: 3000
      frac_value_high: 1
      frac_value_low: 0.05
      max_drainage_distance_ft: 1500
      max_frac_distance_ft: 1500
      max_frac_horiz_distance_ft: 1500
      max_frac_vertical_distance_ft: 1500
      max_parallelism: 10
      min_lateral_overlap_ft: 1000
      number_threads_drain: 1
      number_threads_earth: 10
      number_threads_feature: 5
      number_threads_frac: 10
      number_threads_plot: 5
      sibling_days: 180
      stage_length_ft: 1000
      use_existing_stages: false
      use_frac_penny: true
      use_ortho_stress: false
      run_drainage: true
      create_vertical_well_segments: false
      max_offset_direction_difference_deg: 180
Defaults applied globally unless overridden in sub-phases.
# Midas Project
Midas project is the named module for the earth model pipeline.
crs_proj4
- Explicitly defines the coordinate reference system to be used for any geospatial analysis.
generate_stage_method
- How the horizontal section is discretized: either using a fixed number of segments (e.g. 3) or using pre-loaded stage depths (unusual). Should normally be fixed.
fixed_stage_count
- The number of stages used to segment the lateral length for spacing, landing, and drainage calculations.
Drainage vs Fracture: Drainage indicates the total volume from which a well may be producing, including matrix contribution. Fracture indicates a maximum theoretical distance where two wells may be in communication.
frac_height_ft
- The total height for any modeled fracture growth. Suggested range: [100, 3000], varying with basin, faults, and depositional geomechanics. This setting is only applied when use_frac_penny is enabled.
frac_width_ft
- The total width for any modeled fracture growth. Suggested range: [100, 5000], varying with basin, faults, and depositional geomechanics. This setting is only applied when use_frac_penny is enabled.
frac_value_high and frac_value_low
- Control the amount of co-stimulation that is allowed. Setting the low value closer to 1 reduces the amount of co-stimulation.
max_drainage_distance_ft
- The maximum distance for any modeled stimulated rock volume drainage. Suggested range: [100, 3000], varying with basin, well designs, and depositional geomechanics.
max_frac_distance_ft
- The maximum distance in any direction for any fracture dimension. Suggested range: [100, 5000], varying with basin, well designs, and depositional geomechanics.
max_frac_horiz_distance_ft
- The maximum horizontal distance over which a fracture will be modeled. Suggested range: [100, 3000], varying with basin, well designs, and depositional geomechanics.
max_frac_vertical_distance_ft
- The maximum vertical distance over which a fracture will be modeled. Suggested range: [100, 3000], varying with basin, well designs, and depositional geomechanics.
sibling_days
- The threshold number of days an existing well must have been producing before the subject well for the pair to be considered a parent-infill relation rather than co-developed. For example, if sibling_days = 180 and Well A has been producing for 200 days before Well B, then Well A will be a parent and Well B will be an infill well. If Well A has been producing for 120 days before Well B, then both Well A and Well B are co-developed. Suggested range: [60, 270], varying with basin operational strategies.
stage_length_ft
- Sets the length of the fracturing stages to be used in geomechanical modeling. Does not apply when generate_stage_method = fixed.
use_existing_stages
- If stage lengths are provided in the input data, setting this to true will use the provided data instead of the default in stage_length_ft. Does not apply when generate_stage_method = fixed. Should usually be set to false.
use_frac_penny
- When enabled, 3D fractures are modeled as an ellipse with the provided dimensions. If false, the earth model set in the geo phase with geo: frac_geometry_model_file is used.
use_ortho_stress
- When enabled, spacing and drainage calculations are performed orthogonal to the lateral of the wellbore as opposed to along the orientation of the maximum horizontal stress (which is the default behavior).
run_drainage
- KEY FEATURE. When enabled, PetroAI's proprietary geomechanical earth model is used to derive each well's total drainage area. This is a fundamental feature for capturing the effects of well spacing and parent-infill interactions.
create_vertical_well_segments
- When enabled, both vertical and horizontal wells are processed to calculate penetration depths through the structure grids.
max_offset_direction_difference_deg
- Sets the threshold angle for an offset well's orientation to be considered a neighbor when performing geometric spacing calculations.
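The sibling_days example above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the pipeline's implementation: the function name classify_pair is hypothetical, and treating a gap of exactly sibling_days as parent-infill is an assumption.

```python
from datetime import date, timedelta


def classify_pair(first_prod_a: date, first_prod_b: date, sibling_days: int = 180) -> str:
    """Classify two wells by the gap between their first production dates.

    Assumption: a gap equal to sibling_days counts as parent-infill.
    """
    gap_days = abs((first_prod_b - first_prod_a).days)
    if gap_days >= sibling_days:
        return "parent-infill"  # the earlier well is the parent
    return "co-developed"


well_a_start = date(2020, 1, 1)
relation_200 = classify_pair(well_a_start, well_a_start + timedelta(days=200))
relation_120 = classify_pair(well_a_start, well_a_start + timedelta(days=120))
print(relation_200, relation_120)
```

With sibling_days = 180, the 200-day gap is classified as parent-infill and the 120-day gap as co-developed, matching the description above.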
# Geo Phase
phases:
  geo:
    frac_geometry_model_file: apollo_1.json
frac_geometry_model_file
- Specifies the JSON configuration file that contains the geomechanical model to be used in the build.
# Well Phase
phases:
  well:
    interval_alias_mapping:
      WOLFCAMP_A : WCA
      WOLFCAMP_B : WCB
      ...
    skip_well_extras: false
Sets configurations specific to well data and features.
interval_alias_mapping
- Each row provides an alias for a named value in interval.
skip_well_extras
- If enabled, the WellExtra table will not be ingested.
# Features Phase
# Forecasting
phases:
  features:
    forecasting:
      wells_per_scenario: 750
      ignore_wells_without_forecast_summary: true
      early_life_well_forecast_options:
        max_radius_in_meters: 8000.0
      forecast_options:
        arps:
          qi: [1.1, 1.3]
          b: [0.5, 1.2]
          de: [0.5, 0.99]
          dmin: [0.06, 0.06]
        normalize:
          startMode: peakRate
          peakFluid: auto
          gorThreshold: 10
        eol:
          enabled: false
        forecast:
          years: 40
          frequency: monthly
          minProductionCount: 3
Defines how forecast scenarios are configured, including Arps parameters and normalization rules.
wells_per_scenario
- Sets the number of wells grouped into a batch for forecasting, both for efficiency and for grouping analogs.
ignore_wells_without_forecast_summary
- If enabled, skips any wells that did not pass the earlier pipeline step that processes production data for forecasting.
# Early Life Well Forecast Options
max_radius_in_meters
- When a well has been producing for fewer months than specified in minProductionCount, this parameter sets the radius of investigation for referencing analogous wells for setting forecast parameters.
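As a rough sketch of the radius-of-investigation idea, the following Python selects candidate analog wells within max_radius_in_meters of an early-life well. The function names, the dictionary layout, and the use of straight-line distance in projected (x, y) meters are all assumptions made for illustration; the pipeline's actual analog selection is not documented here.

```python
import math


def needs_analogs(production_months: int, min_production_count: int = 3) -> bool:
    """True when the well has fewer production months than minProductionCount."""
    return production_months < min_production_count


def analogs_within_radius(subject_xy, candidates, max_radius_in_meters=8000.0):
    """Return ids of candidate wells within the radius of the subject well.

    Assumption: coordinates are projected (x, y) pairs in meters.
    """
    sx, sy = subject_xy
    return [
        well_id
        for well_id, (x, y) in candidates.items()
        if math.hypot(x - sx, y - sy) <= max_radius_in_meters
    ]


candidates = {"A": (1000.0, 0.0), "B": (9000.0, 0.0), "C": (6000.0, 5000.0)}
if needs_analogs(production_months=2):
    analogs = analogs_within_radius((0.0, 0.0), candidates)
    print(analogs)
```

Here well B lies 9000 m away and is excluded, while A and C fall inside the 8000 m radius.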
# Forecast Options
# Arps - Multi-segment Hyperbolic Parameterization
Reference: https://www.phdwin.com/wp-content/uploads/2017/05/About-Arps-Equations.pdf
qi
- The range of initial starting rates for best fitting a curve to production data. Units are ratios to peak production. Suggested range: [0.8, 1.3].
b
- The range for fitting the b-factor in the hyperbolic segment. The b-factor is dimensionless. Suggested range: [0.5, 1.2].
de
- The range for fitting the secant effective instantaneous decline rate. Units are a fraction that must be less than 1. Suggested range: [0.5, 0.99].
dmin
- The range for appending the exponential decline segment of a forecast. Units are a fraction that must be less than 1. Suggested range: [0.05, 0.12]. Set the upper and lower bounds to the same number to assert a fixed dmin.
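To make the roles of qi, b, de, and dmin concrete, here is a minimal Python sketch of a textbook modified-hyperbolic Arps curve: hyperbolic decline that switches to exponential once the instantaneous decline falls to the terminal rate. It is not PetroAI's fitting code. It assumes that qi is already a rate (the configured ratio multiplied by peak production), that de is a secant effective annual decline, and that dmin is an effective annual terminal decline; the function name is hypothetical.

```python
import math


def modified_hyperbolic_rate(t_years, qi, b, de, dmin):
    """Rate at t_years for a hyperbolic decline that ends in an exponential tail.

    Assumptions: qi is a rate at t = 0, de is a secant effective annual
    decline, dmin is an effective annual terminal decline, and b > 0.
    """
    # Secant effective -> nominal initial decline: Di = ((1 - De)^-b - 1) / b
    di = ((1.0 - de) ** (-b) - 1.0) / b
    # Effective -> nominal terminal decline: Dmin = -ln(1 - dmin)
    d_term = -math.log(1.0 - dmin)
    # Time at which the instantaneous decline Di / (1 + b*Di*t) falls to Dmin.
    t_switch = max(0.0, (di / d_term - 1.0) / (b * di))
    if t_years <= t_switch:
        return qi / (1.0 + b * di * t_years) ** (1.0 / b)
    q_switch = qi / (1.0 + b * di * t_switch) ** (1.0 / b)
    return q_switch * math.exp(-d_term * (t_years - t_switch))


# 40 years of monthly rates for an illustrative parameter set.
rates = [modified_hyperbolic_rate(m / 12.0, 1000.0, 1.0, 0.7, 0.06) for m in range(480)]
```

By the definition of a secant effective decline, the rate after one year in this sketch equals qi * (1 - de) as long as the switch to the exponential segment happens later than one year.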
# Normalization of production data for fitting
startMode
- Sets the starting point for fitting the production profile. The recommendation is to use peakRate for optimized curve fitting and to apply "ramp up" segments after forecast generation. Can be: start, peakRate, manual, or localPeakRate.
peakFluid
- Explicitly defines, or allows the system to determine, the best fluid for setting the starting point in the time series for fitting forecast parameters. The recommendation is to use auto. Can be: oil, gas, or auto.
gorThreshold
- Sets a maximum limit for the gas-oil ratio while fitting the different stream parameters. Units are mcf/bbl. Suggested range: [6, 20].
eol
- "End of life" settings.
enabled
- When set to true, the forecast is fit to only the last X years of production data when the well has been producing for Y years.
yearsOn
- If eol is enabled, it only applies to wells that have been producing for at least the number of years defined by this input.
yearsEnd
- If eol is enabled, the decline curve is fit through the last number of years defined by this input.
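The interaction of enabled, yearsOn, and yearsEnd can be illustrated with a small Python sketch that picks the slice of history used for fitting. The function name and the assumption of exactly one production record per month are hypothetical; the real pipeline may select the window differently.

```python
def eol_fit_window(monthly_rates, enabled, years_on, years_end):
    """Return the monthly records that would be used for curve fitting.

    Assumptions: one record per producing month, and years_on and
    years_end are positive integers.
    """
    months_on = len(monthly_rates)
    if enabled and months_on >= years_on * 12:
        # Well is old enough: fit only the most recent years_end years.
        return monthly_rates[-years_end * 12:]
    # Otherwise fit the full history.
    return monthly_rates


ten_year_history = list(range(120))
window = eol_fit_window(ten_year_history, enabled=True, years_on=5, years_end=2)
print(len(window))
```

For a well with ten years of history, yearsOn = 5 and yearsEnd = 2 leave only the last 24 monthly records for fitting; a well with three years of history keeps its full record.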
# Forecast generation settings
years
- Sets the total number of years for the well forecast, which limits the EUR and generates production volumes.
frequency
- Sets the segments for generating forecast volumes. Options: yearly, monthly, and daily. Warning: daily will significantly increase compute time.
minProductionCount
- Sets the minimum number of production records required to generate a forecast.
# Feature Building
phases:
  features:
    feature_building:
      num_processing_jobs: 60
      make_plots: false
Controls parallelization and optional plotting of feature sets.
num_processing_jobs
- Defines how many parallel compute machines will be utilized. It is recommended to start with 4.
make_plots
- If enabled, generates gunbarrel images for evaluating and visualizing well spacing and drainage. Warning: enabling this will increase compute time significantly.
# Model Phase
For more details on the phases > model configuration and best practices, visit Model Configuration.
phases:
  model:
    model_configs:
      bundle1:
        training_filter: ...
        evaluation_filter: ...
        model_features:
          - lateralLength
          - totalDrainage
          ...
      bundle2:
Model bundles may be explicitly defined for different subsets of wells. Different feature sets may be used in different model bundles; however, it is recommended to maintain the same features across model bundles for reliable and meaningful interpretation and comparison. Each model bundle may be named by the user.
# Training Filter
The training_filter configuration passes a list of filters that select a subset of the data for training the machine learning models. It is recommended to use these filters to remove any erroneous data from the model. The filter is a single long string that names variables and uses logical operators to compare them against lists or values. The named variables should be fields from the CORE_well_feature tables.
For example:
model:
  model_configs:
    bundle1:
      training_filter: "interval in ['zone1', 'zone2', 'zone3'] & completionYear >=2010 & lateralLength > 3000"
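To show which wells the example filter keeps, the same three conditions are written out below as an explicit Python predicate. This only illustrates the filter's meaning under the assumption that & combines conditions as a logical AND; it is not how the pipeline parses the string, and the well records are made up for the example.

```python
def passes_training_filter(well: dict) -> bool:
    """Explicit form of the example training_filter (all conditions must hold)."""
    return (
        well["interval"] in ["zone1", "zone2", "zone3"]
        and well["completionYear"] >= 2010
        and well["lateralLength"] > 3000
    )


# Hypothetical well records using the field names from the example filter.
wells = [
    {"wellId": "w1", "interval": "zone1", "completionYear": 2015, "lateralLength": 7500},
    {"wellId": "w2", "interval": "zone4", "completionYear": 2018, "lateralLength": 9000},
    {"wellId": "w3", "interval": "zone2", "completionYear": 2008, "lateralLength": 5000},
    {"wellId": "w4", "interval": "zone3", "completionYear": 2012, "lateralLength": 3000},
]
training_ids = [w["wellId"] for w in wells if passes_training_filter(w)]
print(training_ids)
```

Only w1 passes: w2 is in an interval outside the list, w3 was completed before 2010, and w4 has a lateral length that is not strictly greater than 3000.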
# Evaluation Filter
The evaluation_filter should use the same operands as the training_filter, except for the production data filtering. For example, the minimum production filter may be removed so that predictions are made for early-life wells.
# Model Features
Provide a list of features from the CORE_well_features table to be used for the model training.
# Product Phase
# Raw
product:
  raw:
    dca_fs_batch_size: 100
    prod_batch_size: 250
Parameters to optimize downloading and publishing PDP and DCA time series data.
# Core
product:
  core:
    forecast:
      dca_included_well_fields_to_types:
        wellId: "str"
        ...
    model:
      num_years_to_forecast: 40
Specifies included fields for DCA and how long PDP models forecast into the future.
# Inventory (inv)
inv:
include_pdps: true
max_nearby_pdp_distance_miles: 3
valid_pdps_group_name : ""
num_years_to_forecast: 40
include_timeseries_monthly: true
make_plots: true
inventory_options:
crs_proj4: "+proj=utm +zone=13 +datum=NAD27 +units=m +no_defs"
midas_project_options: null
partition_options: null
well_features_transformations:
# one or more of these
- type: add_uniform_value_column
column_name: totalProppantByPerfLength
column_dtype: float
column_value: 2500
overwrite: true
- type: add_uniform_value_column
column_name: totalFluidByPerfLength
column_dtype: float
column_value: 2200
overwrite: true
sensitivity_features:
sampled_features:
- feature: totalProppantByPerfLength
low: 1500
high: 3000
step: 500
linked_features:
- feature: totalFluidByPerfLength
value_expr: "row['totalProppantByPerfLength']"
Inventory prediction controls including batch size, transformations, and sensitivity analysis.
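The sensitivity_features block above can be read as a parameter sweep: each sampled feature is stepped from low to high, and each linked feature is recomputed from the sampled row. The Python sketch below illustrates that reading under stated assumptions: high is treated as inclusive, the function names are hypothetical, and the value_expr string is represented by an equivalent lambda (value_fn) because how the pipeline evaluates the expression is not documented here.

```python
def sweep_values(low, high, step):
    """Values from low to high in increments of step (high assumed inclusive)."""
    values = []
    value = low
    while value <= high:
        values.append(value)
        value += step
    return values


def expand_sensitivities(base_row, sampled_features, linked_features):
    """Build one scenario row per combination of sampled feature values."""
    rows = [dict(base_row)]
    for spec in sampled_features:
        values = sweep_values(spec["low"], spec["high"], spec["step"])
        rows = [{**row, spec["feature"]: v} for row in rows for v in values]
    # Linked features are derived from each sampled row.
    for row in rows:
        for link in linked_features:
            row[link["feature"]] = link["value_fn"](row)
    return rows


base = {"totalProppantByPerfLength": 2500, "totalFluidByPerfLength": 2200}
sampled = [{"feature": "totalProppantByPerfLength", "low": 1500, "high": 3000, "step": 500}]
# Stand-in for value_expr: "row['totalProppantByPerfLength']"
linked = [{"feature": "totalFluidByPerfLength",
           "value_fn": lambda row: row["totalProppantByPerfLength"]}]
scenarios = expand_sensitivities(base, sampled, linked)
```

Under these assumptions the configuration yields four scenario rows, with proppant at 1500, 2000, 2500, and 3000 and the linked fluid value set equal to the proppant value in each row.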
# Grid
grid:
  num_years_to_forecast: 40
  grid_spec:
    workGroupPrefix: 4WPS
    scenarios:
      - name: 4WPS_04_BS2S
        wells:
          - name: 04_BS2S_w1
            lateralLength:
              value: 10000
Defines the layout for grid-based simulations including type curves and spacing.
# PDP2
pdp2:
  sensitivity_features:
    sampled_features:
      - feature: lateralLength
        low: 7500
        high: 15000
Controls for predicting PDP wells with different engineering parameters (e.g. simulating a different frac size).
# Additional Notes
- midas_project_options are overridable at most levels.
- well_features_transformations ensure completeness of feature data.
- sensitivity_features allow for parameter sweeps (e.g. predict at proppant ranging from 1500-2000 lb/ft at 500 lb/ft increments).