The goal of the project is to develop a modeling and optimization application,
PARETO, that can help organizations better manage, better treat, and – where possible – beneficially reuse produced water from oil and gas operations.
Specifically, PARETO will help decision-makers with:
buildout of the produced water infrastructure
management of produced water volumes
selection of effective treatment technologies
placement & sizing of treatment facilities
identification of beneficial water reuse options
distribution of treated produced water for reuse
The initiative is committed to viewing produced water management from a “systems” perspective and to building an inclusive framework that will unite stakeholders from across the produced water community. The vision is that PARETO will not only help oil & gas but also allow other industries (e.g., agriculture, mining) to explore beneficial reuse opportunities for treated produced water. Figure 1 (below) illustrates the scope of “Project PARETO”.
Project PARETO is a 3-year initiative split into three distinct phases, each lasting one year. In execution year 2021, PARETO will address produced water management, i.e., options for coordinating water deliveries in a given development area. In execution year 2022, the project will shift its attention to produced water treatment. Finally, execution year 2023 will be dedicated to the beneficial reuse of produced water.
In terms of deliverables, PARETO itself will be released as free and open-source software every year of the initiative, with increasing capabilities and functionality becoming available over time. The project team is also committed to conducting case studies with industrial and other partners; where possible, findings from those collaborations will be shared with the produced water community as best practice reports.
The project will also be continuously evaluated by a comprehensive stakeholder board that involves individuals representing upstream operators, midstream organizations, treatment technology providers, beneficial reuse entities, regulatory agencies and others, all of whom will guide the project team and provide necessary input.
To install the PARETO framework on Windows operating systems, follow the set of instructions below
that are appropriate for your needs. If you need assistance, please start a new discussion on
our GitHub Discussions forum or send
an email to pareto-support@project-pareto.org.
The installation instructions vary slightly depending on the role you will have with Project PARETO.
Below are the roles we’ve identified:
Users: Use the PARETO platform to develop models, but never contribute to
development of the framework (i.e. never commit changes to the project-pareto
repo). This includes people who only work with protected data.
Core-dev: Work primarily on PARETO platform development and never handle
protected data.
Hybrid: Handle protected data, but also commit changes to the project-pareto
repo (even occasionally). This role requires approval from Markus Drouven, PhD.
Open the Anaconda Prompt (Start -> “Anaconda Prompt”).
Warning
If you are using Python for other complex projects, you may want to
consider using environments of some sort to avoid conflicting
dependencies. There are several good options including conda
environments if you use Anaconda.
Install PARETO with pip by one of the following methods
To get the latest release:
pip install project-pareto
To get a specific release, for example 1.6.3:
pip install project-pareto==1.6.3
If you need unreleased cutting-edge development versions of PARETO, you
can install PARETO directly from the GitHub repo either from the main
PARETO repo or a developer’s fork and branch (this installs from GitHub
but does not create a local git clone/workspace):
In this new project-pareto directory, run the following command which
installs PARETO in editable mode so that developers can make changes and
push to their fork/branch:
We use Sphinx for writing and building our online documentation.
Sphinx translates a set of plain-text .rst (reStructuredText) files into various output formats, such as HTML or PDF
(via LaTeX).
After installing as a Core-dev or User (as described above),
you can build the documentation locally on your system by running the make command in the docs
folder, as follows:
$ cd project-pareto/docs/
$ make html
Visit the Sphinx Style Guide for information on
syntax rules, tips, and FAQ.
The Produced Water Application for Beneficial Reuse, Environmental Impact and Treatment Optimization (PARETO) is specifically designed for produced water management and beneficial reuse. The major deliverable of this project will be an open-source, optimization-based, downloadable and executable produced water decision-support application, PARETO, that can be run by upstream operators, midstream companies, technology providers, water end users, research organizations and regulators.
PARETO is designed as an executable optimization-based decision-support application. In return for specifying select input data, users will be provided with specific, actionable recommendations as program outputs. The table below summarizes representative inputs and outputs.
PARETO users will be able to choose from a range of objectives for their optimization runs; these can range from minimizing costs to maximizing the reuse of produced water (or combinations thereof).
Given a fixed network of pads (completion and/or production), storage tanks, water forecasts (both consumption and production), and distribution options (trucks and/or pipelines), the operational water management model provides insight into the operational costs associated with water management. The operational model allows the user to explore the tradeoff between minimizing costs (distribution, storage, treatment, disposal, etc.) and maximizing water reuse.
Operational Model Mathematical Program Formulation
The default objective function for this produced water operational model is to minimize costs, which includes operational costs associated with procurement of fresh water, the cost of disposal, trucking and piping produced water between well pads and treatment facilities, and the cost of storing, treating and reusing produced water. A credit for using treated water is also considered, and additional slack variables are included to facilitate the identification of potential issues with input data.
This constraint sets the storage level at the completions pad. For each completions pad and for each time period, completions pad storage is equal to storage in last time period plus water put in minus water removed. If it is the first time period, the pad storage is the initial pad storage.
This constraint has not actually been implemented yet.
Storage Site Truck Offloading Capacity: ∀s ∈ S, t ∈ T
For each storage site and each time period, the volume of water being trucked into the storage site must be below the trucking offloading capacity for that storage site.
For each storage site and each time period, the volume of water being trucked into the storage site must be less than the processing capacity for that storage site.
If there are individual production tanks, the water level must be tracked at each tank. The water level at a given tank at the end of each period is equal to the water level at the previous period plus the flowback supply forecast at the pad minus the water that is drained. If it is the first period, it is equal to the initial water level.
For individual production tanks: ∀(p,a) ∈ PAL, t ∈ T
If there are individual production tanks, the water drained across all tanks at the completions pad must be equal to the produced water for transport at the pad.
The constraint proposed above is not necessary but included to facilitate switching between (1) an equalized production tank version and (2) a non-equalized production tank version.
Production Pad Supply Balance: ∀p ∈ PP, t ∈ T
All produced water must be accounted for. For each production pad and for each time period, the volume of outgoing water must be equal to the produced water transported out of the production pad.
Completions Pad Supply Balance (i.e. Flowback Balance): ∀p ∈ CP, t ∈ T
All flowback water must be accounted for. For each completions pad and for each time period, the volume of outgoing water must be equal to the forecasted flowback produced water for the completions pad.
Flow balance constraint (i.e., inputs are equal to outputs). For each pipeline node and for each time period, the volume of water into the node is equal to the volume of water out of the node.
Technically this constraint should only be enforced for truly reversible arcs (e.g. NCA and CNA); and even then it only needs to be defined per one reversible arc (e.g. NCA only and not NCA and CNA).
For each storage site and for each time period, if it is the first time period, the storage level is the initial storage. Otherwise, the storage level is equal to the storage level in the previous time period plus water inputs minus water outputs.
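The recurrence described above can be sketched in a few lines of Python. This is only an illustration, not the model's implementation: the function name and arguments are hypothetical, and it assumes (as the strategic model's storage balance is written) that the first-period level starts from the initial storage and is then adjusted by that period's inflows and outflows.

```python
def storage_levels(initial_level, inflows, outflows):
    """Return the storage level at the end of each time period.

    First period:   level = initial_level + inflow - outflow
    Later periods:  level = previous level + inflow - outflow
    """
    if len(inflows) != len(outflows):
        raise ValueError("inflows and outflows must cover the same periods")
    levels = []
    previous = initial_level
    for water_in, water_out in zip(inflows, outflows):
        previous = previous + water_in - water_out
        levels.append(previous)
    return levels


# Hypothetical data: three time periods at one storage site.
levels = storage_levels(100.0, [50.0, 0.0, 20.0], [30.0, 40.0, 10.0])
print(levels)
```

In the optimization model the flows are decision variables, so this balance is imposed as an equality constraint rather than evaluated sequentially.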
The total stored water in a given time period must be less than the capacity. If the storage capacity limits the feasibility, the slack variable will be nonzero, and the storage capacity will be increased to allow a feasible solution.
The total disposed water in a given time period must be less than the capacity. If the disposal capacity limits the feasibility, the slack variable will be nonzero, and the disposal capacity will be increased to allow a feasible solution.
The total treated water in a given time period must be less than the capacity. If the treatment capacity limits the feasibility, the slack variable will be nonzero, and the treatment capacity will be increased to allow a feasible solution.
The total water for beneficial reuse in a given time period must be less than the capacity. If the beneficial reuse capacity limits the feasibility, the slack variable will be nonzero, and the beneficial reuse capacity will be increased to allow a feasible solution.
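To illustrate how a capacity slack behaves, the sketch below computes the smallest non-negative slack that makes a capacity constraint of the form volume <= capacity + slack hold. The function name and data are hypothetical; in the model itself the slack is a decision variable that is penalized in the objective, so this only shows the value it would take when the penalty drives it as low as possible.

```python
def required_slack(volume, capacity):
    """Smallest non-negative slack s such that volume <= capacity + s."""
    return max(0.0, volume - capacity)


# Hypothetical stored volumes per time period against a fixed capacity.
stored = {"t1": 400.0, "t2": 650.0, "t3": 500.0}
capacity = 500.0

slack = {t: required_slack(v, capacity) for t, v in stored.items()}
print(slack)  # only t2 needs slack, because 650 exceeds the capacity of 500
```

A nonzero slack therefore points directly at the time period and site where the input capacity data is too tight.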
For each freshwater source, for each completions pad, and for each time period, the freshwater sourcing cost is equal to all output from the freshwater source times the freshwater sourcing cost.
For each disposal site, for each time period, the disposal cost is equal to all water moved into the disposal site multiplied by the operational disposal cost. Total disposal cost is the sum of disposal costs over all time periods and all disposal sites.
For each treatment site, for each time period, the treatment cost is equal to all water moved to the treatment site multiplied by the operational treatment cost. The total treatments cost is the sum of treatment costs over all time periods and all treatment sites.
Water input into treatment facility is treated with a level of efficiency, meaning only a given percentage of the water input is outputted to be reused at the completions pads.
Completions reuse water is all water that meets completions pad demand, excluding freshwater. Completions reuse cost is the volume of completions reused water multiplied by the cost for reuse.
The trucking cost between two locations in a time period is equal to the trucking volume between the locations in time t divided by the truck capacity (which gives the number of truckloads), multiplied by the lead time between the two locations and the hourly trucking cost.
The constraints above explicitly consider freshwater trucking via FCT arcs.
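As a worked example of the trucking cost calculation, the following sketch multiplies the number of truckloads by the drive time and the hourly rate. All names and numbers are hypothetical, and the number of truckloads is left fractional, which matches the formula as described but is an assumption about how partial loads are treated.

```python
def trucking_cost(volume, truck_capacity, drive_time_hours, hourly_rate):
    """Cost of trucking `volume` between two locations in one time period."""
    truckloads = volume / truck_capacity  # may be fractional
    return truckloads * drive_time_hours * hourly_rate


# Hypothetical values: 1100 bbl moved in 110 bbl trucks, 2 h drive, 95 per hour.
cost = trucking_cost(
    volume=1100.0, truck_capacity=110.0, drive_time_hours=2.0, hourly_rate=95.0
)
print(cost)
```

Here 1100 / 110 gives 10 truckloads, and 10 truckloads at 2 hours each and 95 per hour cost 1900.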
Slack Costs:
Weighted sum of the slack variables. In the case that the model is infeasible, these slack variables are used to determine where the infeasibility occurs (e.g. pipeline capacity is not sufficient).
An extension to this operational optimization model measures the water quality across all locations over time. As of now, water quality is not a decision variable. It is calculated after optimization of the operational model.
The process for calculating water quality is as follows: the operational model is first solved to optimality, water quality variables and constraints are added, flow rates and storage levels are fixed to the solved values at optimality, and the water quality is calculated.
Note
Fixed variables are denoted in purple in the documentation.
Assumptions:
Water quality at a production pad or completions pad remains the same across all time periods
When blending flows of different water quality, they blend linearly
Treatment does not affect water quality
Water Quality Sets
\(\textcolor{blue}{w ∈ W}\) Water Quality Components (e.g., TDS)
Water Quality Parameters
\(\textcolor{green}{v_{l,w,[t]}}\) = Water quality at well pad
\(\textcolor{green}{ξ_{l,w}}\) = Initial water quality at storage
Water Quality Variables
\(\textcolor{red}{Q_{l,w,t}}\) = Water quality at location
Disposal Site Water Quality ∀k ∈ K, w ∈ W, t ∈ T
The water quality of disposed water is dependent on the flow rates into the disposal site and the quality of each of these flows.
The water quality at storage sites is dependent on the flow rates into the storage site, the volume of water in storage in the previous time period, and the quality of each of these flows. Even mixing is assumed, so all outgoing flows have the same water quality. If it is the first time period, the initial storage level and initial water quality replaces the water stored and water quality in the previous time period respectively.
The water quality at treatment sites is dependent on the flow rates into the treatment site, the efficiency of treatment, and the water quality of the flows. Even mixing is assumed, so all outgoing flows have the same water quality. The treatment process does not affect water quality
where \(\textcolor{green}{ϵ_{r,w}^{Treatment}}\) <1
Network Node Water Quality ∀n ∈ N, w ∈ W, t ∈ T
The water quality at nodes is dependent on the flow rates into the node and the water quality of the flows. Even mixing is assumed, so all outgoing flows have the same water quality.
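Under the linear blending assumption, the quality at a node is the flow-weighted average of the incoming qualities. The sketch below shows that calculation for one component (for example TDS); the function name and data are hypothetical, and returning None when nothing flows into the node is a choice made for this illustration only.

```python
def blended_quality(inflows):
    """Flow-weighted average concentration of the streams entering a node.

    `inflows` is a list of (flow_rate, concentration) pairs.
    Returns None if there is no flow into the node.
    """
    total_flow = sum(flow for flow, _ in inflows)
    if total_flow == 0:
        return None
    return sum(flow * conc for flow, conc in inflows) / total_flow


# Hypothetical node fed by two pipelines: (flow in bbl/day, TDS in mg/L).
quality = blended_quality([(100.0, 150000.0), (300.0, 50000.0)])
print(quality)
```

Because the flow rates are fixed to their optimal values before the quality variables are computed, this weighted average stays linear in the unknown qualities.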
Beneficial Reuse Options: This term refers to the reuse of water at mining facilities, farms, etc.
Completions Demand: Demand set by completions pads. This demand can be met by produced water, treated water, or freshwater.
Completions Reuse Water: Water that meets demand at a completions site. This does not include freshwater or water for beneficial reuse.
Network Nodes: These are branch points for pipelines only.
Note
Well pads are not a subset of network nodes.
[t]: This notation indicates that timing of capacity expansion has not yet been implemented.
Terminal Storage Level: These are goal storage levels for the final time period. Without this, the storage levels would likely be depleted in the last time period.
Given a set of existing network components (completion pads, storage pads, production pads, and distribution options like trucks and/or pipelines) and capacity expansion options, the strategic water management model provides insight into financial opportunities and mid- to long-term investment decisions that reduce operational costs, maximize reuse, or reduce freshwater consumption.
Beneficial reuse options: This term refers to the reuse of water at mining facilities, farms, etc.
Completions demand: Demand set by completions pads. This demand can be met by produced water, treated water, or freshwater.
Completions reuse water: Water that meets demand at a completions site. This does not include freshwater or water for beneficial reuse.
Network Nodes: These are branch points for pipelines only.
Note
Well pads are not a subset of network nodes.
\([\textcolor{blue}{t}]\) or \([\textcolor{blue}{t \in T}]\): This notation indicates that timing of capacity expansion has not yet been implemented.
Terminal storage level: These are goal storage levels for the final time period. Without this, the storage levels would likely be depleted in the last time period.
Water boosting: Moving large volumes of water requires water pumps. Water boosting refers to the infrastructure required to maintain water pressure.
\(\textcolor{green}{\tau_{k}^{Disposal}}\) = Disposal construction or expansion lead time
\(\textcolor{green}{\tau_{s}^{Storage}}\) = Storage construction or expansion lead time
\(\textcolor{green}{\tau_{l,\tilde{l}}^{Pipeline}}\) = Pipeline construction or expansion lead time
\(\textcolor{green}{\tau_{l,\tilde{l}}^{Trucking}}\) = Drive time between two locations
\(\textcolor{green}{\lambda_{s}^{Storage}}\) = Initial storage level at storage site
\(\textcolor{green}{\lambda_{p}^{PadStorage}}\) = Initial storage level at completions site
\(\textcolor{green}{\theta_{s}^{Storage}}\) = Terminal storage level at storage site
\(\textcolor{green}{\theta_{p}^{PadStorage}}\) = Terminal storage level at completions site
\(\textcolor{green}{\kappa_{k,i}^{Disposal}}\) = Disposal construction or expansion capital cost for selected capacity increment
\(\textcolor{green}{\kappa_{s,c}^{Storage}}\) = Storage construction or expansion capital cost for selected capacity increment
\(\textcolor{green}{\kappa_{r,j}^{Treatment}}\) = Treatment construction or expansion capital cost for selected capacity increment
The cost parameter for expanding or constructing new pipeline capacity is structured differently depending on model configuration settings. If the pipeline cost configuration is distance based:
\(\textcolor{green}{\kappa^{Pipeline}}\) = Pipeline construction or expansion capital cost [currency/(diameter-distance)]
\(\textcolor{green}{\mu_{d}^{Pipeline}}\) = Pipeline diameter installation or expansion increments [diameter]
Otherwise, if the pipeline cost configuration is capacity based:
\(\textcolor{green}{\kappa_{l,\tilde{l},d}^{Pipeline}}\) = Pipeline construction or expansion capital cost for selected diameter capacity [currency/(volume/time)]
\(\textcolor{green}{\delta_{d}^{Pipeline}}\) = Increments for installation/expansion of pipeline capacity [volume/time]
Two objective functions can be considered for the optimization of a produced water system: first, the minimization of costs, which includes operational costs associated with procurement of fresh water, the cost of disposal, trucking and piping produced water between well pads and treatment facilities, and the cost of storing, treating and reusing produced water. Capital costs are also considered due to infrastructure build out such as the installation of pipelines, treatment, and storage facilities. A credit for (re)using treated water is also considered, and additional slack variables are included to facilitate the identification of potential issues with input data. The second objective is the maximization of water reused which is defined as the ratio between the treated produced water that is used in completions operations and the total produced water coming to surface.
The annualization rate is calculated using the formula described at this website: https://www.investopedia.com/terms/e/eac.asp.
The annualization rate takes the discount rate (rate) and the number of years the CAPEX investment is expected to be used (life) as input.
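A minimal sketch of this calculation, assuming the standard equivalent annual cost factor rate / (1 - (1 + rate)^(-life)) given by the referenced formula. The function name is hypothetical, and the zero-discount-rate branch (which returns the limiting value 1 / life) is an addition for this illustration.

```python
def annualization_rate(discount_rate, life):
    """Factor that converts a CAPEX amount into an equivalent annual cost.

    discount_rate: yearly discount rate as a fraction (e.g. 0.08 for 8%).
    life: number of years the investment is expected to be used.
    """
    if discount_rate == 0:
        return 1.0 / life  # limiting value of the formula as the rate tends to 0
    return discount_rate / (1.0 - (1.0 + discount_rate) ** -life)


# Hypothetical inputs: 8% discount rate, 20-year asset life.
factor = annualization_rate(0.08, 20)
print(round(factor, 4))
```

Multiplying a capital cost by this factor spreads it over the asset's life, so that capital and operating costs can be compared on the same yearly basis.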
Completions Pad Demand Balance: \(\forall \textcolor{blue}{p \in CP}, \textcolor{blue}{t \in T}\)
Completions pad demand can be met by trucked or piped water moved into the pad in addition to water in completions pad storage. For each completions pad and for each time period, completions demand at the given pad is equal to the sum of all piped and trucked water moved into the completions pad plus water removed from the pad storage minus water put into the pad storage plus a slack.
Completions Pad Storage Balance: \(\forall \textcolor{blue}{p \in CP}, \textcolor{blue}{t \in T}\)
Sets the storage level at the completions pad. For each completions pad and for each time period, completions pad storage is equal to storage in last time period plus water put in minus water removed. If it is the first time period, the pad storage is the initial pad storage.
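One way to write this balance, following the same pattern as the storage site balance in this document, is sketched below. The level variable name \(\textcolor{red}{L_{p,t}^{PadStorage}}\) is an assumption made for illustration; \(\textcolor{green}{\lambda_{p}^{PadStorage}}\), \(\textcolor{red}{F_{p,t}^{PadStorageIn}}\) and \(\textcolor{red}{F_{p,t}^{PadStorageOut}}\) are the symbols used elsewhere in this document.

For \(t = 1\):
\[\textcolor{red}{L_{p,t}^{PadStorage}}
= \textcolor{green}{\lambda_{p}^{PadStorage}}
+ \textcolor{red}{F_{p,t}^{PadStorageIn}}
- \textcolor{red}{F_{p,t}^{PadStorageOut}}\]
For \(t > 1\):
\[\textcolor{red}{L_{p,t}^{PadStorage}}
= \textcolor{red}{L_{p,t-1}^{PadStorage}}
+ \textcolor{red}{F_{p,t}^{PadStorageIn}}
- \textcolor{red}{F_{p,t}^{PadStorageOut}}\]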
For each storage site and each time period, the volume of water being trucked into the storage site must be below the trucking offloading capacity for that storage site.
\[\sum_{p \in P | (p, s) \in PST}\textcolor{red}{F_{p,s,t}^{Trucked}}
+ \sum_{p \in P | (p, s) \in CST}\textcolor{red}{F_{p,s,t}^{Trucked}}
\leq \textcolor{green}{\sigma_{s}^{Offloading,Storage}}\]
Storage Site Processing Capacity: \(\forall \textcolor{blue}{s \in S}, \textcolor{blue}{t \in T}\)
For each storage site and each time period, the volume of water being piped and trucked into the storage site must be less than the processing capacity for that storage site.
\[ \begin{align}\begin{aligned}\sum_{n \in N | (n, s) \in NSA}\textcolor{red}{F_{n,s,t}^{Piped}}
+ \sum_{r \in R | (r, s) \in RSA}\textcolor{red}{F_{r,s,t}^{Piped}}
+ \sum_{p \in P | (p, s) \in PST}\textcolor{red}{F_{p,s,t}^{Trucked}}\\ + \sum_{p \in P | (p, s) \in CST}\textcolor{red}{F_{p,s,t}^{Trucked}}
\leq \textcolor{green}{\sigma_{s}^{Processing,Storage}}\end{aligned}\end{align} \]
Production Pad Supply Balance: \(\forall \textcolor{blue}{p \in PP}, \textcolor{blue}{t \in T}\)
All produced water must be accounted for. For each production pad and for each time period, the volume of outgoing water must be equal to the forecasted produced water for the production pad.
\[ \begin{align}\begin{aligned}\textcolor{green}{\beta_{p,t}^{Production}}
= \sum_{n \in N | (p, n) \in PNA}\textcolor{red}{F_{p,n,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (p, \tilde{p}) \in PCA}\textcolor{red}{F_{p,\tilde{p},t}^{Piped}}
+ \sum_{\tilde{p} \in P | (p, \tilde{p}) \in PPA}\textcolor{red}{F_{p,\tilde{p},t}^{Piped}}\\ + \sum_{\tilde{p} \in P | (p, \tilde{p}) \in PCT}\textcolor{red}{F_{p,\tilde{p},t}^{Trucked}}
+ \sum_{k \in K | (p,k) \in PKT}\textcolor{red}{F_{p,k,t}^{Trucked}}
+ \sum_{s \in S | (p,s) \in PST}\textcolor{red}{F_{p,s,t}^{Trucked}}\\ + \sum_{r \in R | (p,r) \in PRT}\textcolor{red}{F_{p,r,t}^{Trucked}}
+ \sum_{o \in O | (p,o) \in POT}\textcolor{red}{F_{p,o,t}^{Trucked}}
+ \textcolor{red}{S_{p,t}^{Production}}\end{aligned}\end{align} \]
All flowback water must be accounted for. For each completions pad and for each time period, the volume of outgoing water must be equal to the forecasted flowback produced water for the completions pad.
\[ \begin{align}\begin{aligned}\textcolor{green}{\beta_{p,t}^{Flowback}}
= \sum_{n \in N | (p, n) \in CNA}\textcolor{red}{F_{p,n,t}^{Piped}}
+ \sum_{c \in C | (p, c) \in CCA}\textcolor{red}{F_{p,c,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (p, \tilde{p}) \in CCT}\textcolor{red}{F_{p,\tilde{p},t}^{Trucked}}\\ + \sum_{k \in K | (p, k) \in CKT}\textcolor{red}{F_{p,k,t}^{Trucked}}
+ \sum_{s \in S | (p, s) \in CST}\textcolor{red}{F_{p,s,t}^{Trucked}}
+ \sum_{r \in R | (p, r) \in CRT}\textcolor{red}{F_{p,r,t}^{Trucked}}
+ \textcolor{red}{S_{p,t}^{Flowback}}\end{aligned}\end{align} \]
Flow balance constraint (i.e., inputs are equal to outputs). For each pipeline node and for each time period, the volume of water into the node is equal to the volume of water out of the node.
\[ \begin{align}\begin{aligned}\sum_{p \in P | (p, n) \in PNA}\textcolor{red}{F_{p,n,t}^{Piped}}
+ \sum_{p \in P | (p, n) \in CNA}\textcolor{red}{F_{p,n,t}^{Piped}}
+ \sum_{\tilde{n} \in N | (\tilde{n}, n) \in NNA}\textcolor{red}{F_{\tilde{n},n,t}^{Piped}}\\ + \sum_{s \in S | (s, n) \in SNA}\textcolor{red}{F_{s,n,t}^{Piped}}
+ \sum_{r \in R | (r, n) \in RNA}\textcolor{red}{F_{r,n,t}^{Piped}}\\ = \sum_{\tilde{n} \in N | (n, \tilde{n}) \in NNA}\textcolor{red}{F_{n,\tilde{n},t}^{Piped}}
+ \sum_{p \in P | (n, p) \in NCA}\textcolor{red}{F_{n,p,t}^{Piped}}
+ \sum_{k \in K | (n, k) \in NKA}\textcolor{red}{F_{n,k,t}^{Piped}}\\ + \sum_{r \in R | (n, r) \in NRA}\textcolor{red}{F_{n,r,t}^{Piped}}
+ \sum_{s \in S | (n, s) \in NSA}\textcolor{red}{F_{n,s,t}^{Piped}}
+ \sum_{o \in O | (n, o) \in NOA}\textcolor{red}{F_{n,o,t}^{Piped}}\end{aligned}\end{align} \]
There can only be flow in one direction for a given pipeline arc in a given time period. Flow is only allowed in a given direction if the binary indicator for that direction is “on”.
Technically this constraint should only be enforced for truly reversible arcs (e.g. NCA and CNA); and even then it only needs to be defined per one reversible arc (e.g. NCA only and not NCA and CNA).
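A common way to express this kind of restriction is a big-M formulation, sketched below for a pair of locations \(l\) and \(\tilde{l}\) joined by a reversible arc. The binary variable name \(y^{Flow}\) and the constant \(M^{Flow}\) (an upper bound on any pipeline flow) are assumptions made for illustration and are not confirmed by this text.

At most one direction can be switched on in a given time period:
\[\textcolor{red}{y_{l,\tilde{l},t}^{Flow}} + \textcolor{red}{y_{\tilde{l},l,t}^{Flow}} \leq 1\]
Flow in a direction is only possible if the indicator for that direction is on:
\[\textcolor{red}{F_{l,\tilde{l},t}^{Piped}} \leq \textcolor{red}{y_{l,\tilde{l},t}^{Flow}} \cdot \textcolor{green}{M^{Flow}}\]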
Storage Site Balance: \(\forall \textcolor{blue}{s \in S}, \textcolor{blue}{t \in T}\)
For each storage site and for each time period, if it is the first time period, the storage level is the initial storage. Otherwise, the storage level is equal to the storage level in the previous time period plus water inputs minus water outputs.
For \(t = 1\):
\[ \begin{align}\begin{aligned}\textcolor{red}{L_{s,t}^{Storage}}
= \textcolor{green}{\lambda_{s,t=1}^{Storage}}
+ \sum_{n \in N | (n, s) \in NSA}\textcolor{red}{F_{n,s,t}^{Piped}}
+ \sum_{r \in R | (r, s) \in RSA}\textcolor{red}{F_{r,s,t}^{Piped}}
+ \sum_{p \in P | (p, s) \in PST}\textcolor{red}{F_{p,s,t}^{Trucked}}\\ + \sum_{p \in P | (p, s) \in CST}\textcolor{red}{F_{p,s,t}^{Trucked}}
- \sum_{n \in N | (s, n) \in SNA}\textcolor{red}{F_{s,n,t}^{Piped}}
- \sum_{p \in P | (s, p) \in SCA}\textcolor{red}{F_{s,p,t}^{Piped}}
- \sum_{k \in K | (s, k) \in SKA}\textcolor{red}{F_{s,k,t}^{Piped}}\\ - \sum_{r \in R | (s, r) \in SRA}\textcolor{red}{F_{s,r,t}^{Piped}}
- \sum_{o \in O | (s, o) \in SOA}\textcolor{red}{F_{s,o,t}^{Piped}}
- \sum_{p \in P | (s, p) \in SCT}\textcolor{red}{F_{s,p,t}^{Trucked}}
- \sum_{k \in K | (s, k) \in SKT}\textcolor{red}{F_{s,k,t}^{Trucked}}\end{aligned}\end{align} \]
For \(t > 1\):
\[ \begin{align}\begin{aligned}\textcolor{red}{L_{s,t}^{Storage}}
= \textcolor{red}{L_{s,t-1}^{Storage}}
+ \sum_{n \in N | (n, s) \in NSA}\textcolor{red}{F_{n,s,t}^{Piped}}
+ \sum_{r \in R | (r, s) \in RSA}\textcolor{red}{F_{r,s,t}^{Piped}}
+ \sum_{p \in P | (p, s) \in PST}\textcolor{red}{F_{p,s,t}^{Trucked}}\\ + \sum_{p \in P | (p, s) \in CST}\textcolor{red}{F_{p,s,t}^{Trucked}}
- \sum_{n \in N | (s, n) \in SNA}\textcolor{red}{F_{s,n,t}^{Piped}}
- \sum_{p \in P | (s, p) \in SCA}\textcolor{red}{F_{s,p,t}^{Piped}}
- \sum_{k \in K | (s, k) \in SKA}\textcolor{red}{F_{s,k,t}^{Piped}}\\ - \sum_{r \in R | (s, r) \in SRA}\textcolor{red}{F_{s,r,t}^{Piped}}
- \sum_{o \in O | (s, o) \in SOA}\textcolor{red}{F_{s,o,t}^{Piped}}
- \sum_{p \in P | (s, p) \in SCT}\textcolor{red}{F_{s,p,t}^{Trucked}}
- \sum_{k \in K | (s, k) \in SKT}\textcolor{red}{F_{s,k,t}^{Trucked}}\end{aligned}\end{align} \]
\(\textcolor{green}{\delta_{d}^{Pipeline}}\) can be input by the user or calculated. If the user chooses to calculate pipeline capacity, the parameter is calculated by the equation below, where \({\textcolor{green}{\kappa_{l,\tilde{l}}}}\) is the Hazen-Williams constant, \(\omega\) is the Hazen-Williams exponent as per Cafaro & Grossmann (2021), and \(d\) is the pipeline diameter from the set \(\textcolor{blue}{d \in D}\).
This constraint accounts for the expansion of available storage capacity or installation of storage facilities. If expansion/construction is selected, expand the capacity by the set expansion amount. The water level at the storage site must be less than this capacity. As of now, the model considers that a storage facility is expanded or built at the beginning of the planning horizon. The \(C_0\) notation indicates that the 0th case is also included: the set \(\textcolor{blue}{C}\) is extended with an option in which no capacity is added.
This constraint accounts for the expansion of available disposal sites or installation of new disposal sites. If expansion/construction is selected, expand the capacity by the set expansion amount. The total disposed water in a given time period must be less than this new capacity.
Similarly to disposal and storage capacity construction/expansion constraints, the current treatment capacity can be expanded as required or new facilities may be installed.
Water input into treatment facility is treated with a level of efficiency, meaning only a given percentage of the water input is outputted to be reused at the completions pads.
For each freshwater source, for each completions pad, and for each time period, the freshwater sourcing cost is equal to all output from the freshwater source times the freshwater sourcing cost.
The total fresh sourced volume is the sum of freshwater movements by truck and pipeline over all time periods, completions pads, and freshwater sources.
For each disposal site, for each time period, the disposal cost is equal to all water moved into the disposal site multiplied by the operational disposal cost. Total disposal cost is the sum of disposal costs over all time periods and all disposal sites.
For each treatment site, for each time period, the treatment cost is equal to all water moved to the treatment site multiplied by the operational treatment cost. The total treatments cost is the sum of treatment costs over all time periods and all treatment sites.
Completions reuse water is all water that meets completions pad demand, excluding freshwater. Completions reuse cost is the volume of completions reused water multiplied by the cost for reuse.
\[ \begin{align}\begin{aligned}\textcolor{red}{C_{p,t}^{CompletionsReuse}}
= (\sum_{n \in N | (n, p) \in NCA}\textcolor{red}{F_{n,p,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (\tilde{p}, p) \in PCA}\textcolor{red}{F_{\tilde{p},p,t}^{Piped}}
+ \sum_{r \in R | (r, p) \in RCA}\textcolor{red}{F_{r,p,t}^{Piped}}\\ + \sum_{s \in S | (s, p) \in SCA}\textcolor{red}{F_{s,p,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (\tilde{p}, p) \in CCA}\textcolor{red}{F_{\tilde{p},p,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (\tilde{p}, p) \in CCT}\textcolor{red}{F_{\tilde{p},p,t}^{Trucked}}\\ + \sum_{\tilde{p} \in P | (\tilde{p}, p) \in PCT}\textcolor{red}{F_{\tilde{p},p,t}^{Trucked}}
+ \sum_{s \in S | (s, p) \in SCT}\textcolor{red}{F_{s,p,t}^{Trucked}}) \cdot \textcolor{green}{\pi_{p}^{CompletionsReuse}}\end{aligned}\end{align} \]
Note
Freshwater sourcing is excluded from completions reuse costs.
The total reuse volume is the total volume of produced water reused, or the total water meeting completions pad demand over all time periods, excluding freshwater.
The trucking cost between two locations in a time period is equal to the trucking volume between the locations in time \(\textcolor{blue}{t}\) divided by the truck capacity (which gives the number of truckloads), multiplied by the lead time between the two locations and the hourly trucking cost.
Disposal Construction or Capacity Expansion Cost: \(\forall \textcolor{blue}{t \in T}\)
Cost related to expanding or constructing new disposal capacity. Takes into consideration the capacity increment, the cost for the selected capacity increment, and whether the construction/expansion is selected to occur.
Storage Construction or Capacity Expansion Cost: \(\forall \textcolor{blue}{t \in T}\)
Cost related to expanding or constructing new storage capacity. Takes into consideration the capacity increment, the cost for the selected capacity increment, and whether the construction/expansion is selected to occur.
Treatment Construction or Capacity Expansion Cost: \(\forall \textcolor{blue}{t \in T}\)
Cost related to expanding or constructing new treatment capacity. Takes into consideration the capacity increment, the cost for the selected capacity increment, and whether the construction/expansion is selected to occur.
Pipeline Construction or Capacity Expansion Cost: \(\forall \textcolor{blue}{t \in T}\)
Cost related to expanding or constructing new pipeline capacity is calculated differently depending on model configuration settings.
If the pipeline cost configuration is capacity based, pipeline expansion cost is calculated using capacity increments, the cost for the selected capacity increment, and whether the construction/expansion is selected to occur.
If the pipeline cost configuration is distance based, pipeline expansion cost is calculated using pipeline distances, pipeline diameters, the cost per inch mile, and whether the construction/expansion is selected to occur.
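The difference between the two pipeline cost configurations can be sketched as follows. The function names, argument names, and numbers are hypothetical and the functions cover a single arc and a single size option; they are meant only to show which quantities are multiplied in each case.

```python
def pipeline_cost_distance_based(cost_per_inch_mile, diameter_inches,
                                 distance_miles, selected):
    """Capital cost = cost per inch-mile x diameter x distance, if selected."""
    if not selected:
        return 0.0
    return cost_per_inch_mile * diameter_inches * distance_miles


def pipeline_cost_capacity_based(cost_per_unit_capacity, capacity_increment,
                                 selected):
    """Capital cost = cost per unit of capacity x capacity increment, if selected."""
    if not selected:
        return 0.0
    return cost_per_unit_capacity * capacity_increment


# Hypothetical option: an 8-inch line over 12 miles at 50,000 per inch-mile.
by_distance = pipeline_cost_distance_based(50000.0, 8.0, 12.0, selected=True)

# Hypothetical option: 100,000 bbl/day of capacity at 30 per (bbl/day).
by_capacity = pipeline_cost_capacity_based(30.0, 100000.0, selected=True)

print(by_distance, by_capacity)
```

In the model, the `selected` flag corresponds to the binary build decision, and the total cost sums these terms over all arcs and size options.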
Weighted sum of the slack variables. In the case that the model is infeasible, these slack variables are used to determine where the infeasibility occurs (e.g. pipeline capacity is not sufficient).
Completions reuse deliveries at a completions pad in time period \(\textcolor{blue}{t}\) is equal to all piped and trucked water moved into the completions pad, excluding freshwater.
\(\forall \textcolor{blue}{p \in CP}, \textcolor{blue}{t \in T}\)
Disposal deliveries for disposal site \(\textcolor{blue}{k}\) at time \(\textcolor{blue}{t}\) is equal to all piped and trucked water moved to the disposal site \(\textcolor{blue}{k}\).
\(\forall \textcolor{blue}{k \in K}, \textcolor{blue}{t \in T}\)
Beneficial reuse deliveries for beneficial reuse site \(\textcolor{blue}{o}\) at time \(\textcolor{blue}{t}\) is equal to all piped and trucked water moved to the beneficial reuse site \(\textcolor{blue}{o}\).
\(\forall \textcolor{blue}{o \in O}, \textcolor{blue}{t \in T}\)
Completions deliveries destination for completions pad \(\textcolor{blue}{p}\) at time \(\textcolor{blue}{t}\) is equal to all piped and trucked water moved to the completions pad.
\(\forall \textcolor{blue}{p \in CP}, \textcolor{blue}{t \in T}\)
\[ \begin{aligned}
\textcolor{red}{F_{p,t}^{CompletionsDestination}}
= {} & \sum_{n \in N | (n, p) \in NCA}\textcolor{red}{F_{n,p,t}^{Piped}}
+ \sum_{\tilde{p} \in P | (\tilde{p}, p) \in PCA}\textcolor{red}{F_{\tilde{p},p,t}^{Piped}}
+ \sum_{s \in S | (s, p) \in SCA}\textcolor{red}{F_{s,p,t}^{Piped}} \\
& + \sum_{\tilde{p} \in P | (\tilde{p}, p) \in CCA}\textcolor{red}{F_{\tilde{p},p,t}^{Piped}}
+ \sum_{r \in R | (r, p) \in RCA}\textcolor{red}{F_{r,p,t}^{Piped}}
+ \sum_{f \in F | (f, p) \in FCA}\textcolor{red}{F_{f,p,t}^{Sourced}} \\
& + \sum_{\tilde{p} \in P | (\tilde{p}, p) \in PCT}\textcolor{red}{F_{\tilde{p},p,t}^{Trucked}}
+ \sum_{s \in S | (s, p) \in SCT}\textcolor{red}{F_{s,p,t}^{Trucked}}
+ \sum_{\tilde{p} \in P | (\tilde{p}, p) \in CCT}\textcolor{red}{F_{\tilde{p},p,t}^{Trucked}} \\
& + \sum_{f \in F | (f, p) \in FCT}\textcolor{red}{F_{f,p,t}^{Trucked}}
+ \textcolor{red}{F_{p,t}^{PadStorageOut}}-\textcolor{red}{F_{p,t}^{PadStorageIn}}
\end{aligned} \]
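As a numerical illustration of the delivery balances, the sketch below adds up flows into one completions pad for a single time period. All location names and volumes are hypothetical, and the flows are grouped more coarsely than in the equation (all piped flows in one dictionary, all freshwater in another); it is meant only to show how the reuse deliveries (no freshwater) differ from the completions destination deliveries (freshwater and net pad storage withdrawal included).

```python
# Hypothetical flows into completions pad "CP01" in one time period,
# keyed by (origin, destination).
piped = {("N01", "CP01"): 400.0, ("S01", "CP01"): 150.0}
trucked = {("PP01", "CP01"): 50.0}
freshwater = {("F01", "CP01"): 100.0}  # sourced or trucked freshwater
pad_storage_out = {"CP01": 30.0}       # withdrawn from pad storage
pad_storage_in = {"CP01": 20.0}        # sent to pad storage


def inflow(flows, pad):
    """Total flow in `flows` whose destination is `pad`."""
    return sum(v for (_, dest), v in flows.items() if dest == pad)


def reuse_deliveries(pad):
    """Piped and trucked water into the pad, excluding freshwater."""
    return inflow(piped, pad) + inflow(trucked, pad)


def completions_destination(pad):
    """All deliveries plus freshwater plus net withdrawal from pad storage."""
    return (
        reuse_deliveries(pad)
        + inflow(freshwater, pad)
        + pad_storage_out[pad]
        - pad_storage_in[pad]
    )


print(reuse_deliveries("CP01"))         # 600.0
print(completions_destination("CP01"))  # 710.0
```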
An extension to this strategic optimization model measures the water quality across all locations over time. As of now, water quality is not a decision variable. It is calculated after optimization of the strategic model.
The process for calculating water quality is as follows: the strategic model is first solved to optimality, water quality variables and constraints are added, flow rates and storage levels are fixed to the solved values at optimality, and the water quality is calculated.
Note
Fixed variables are colored purple in the documentation.
Assumptions:
Water quality of produced water from production pads and completions pads remains the same across all time periods
When blending flows of different water quality, they blend linearly
Treatment does not affect water quality
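The linear blending assumption can be sketched as a flow-weighted average. The function name and the numbers are hypothetical; returning 0.0 when there is no flow is a choice made for this sketch only, not something stated in the documentation.

```python
def blend(flows, qualities):
    """Flow-weighted average quality of several mixed streams."""
    total_flow = sum(flows)
    if total_flow == 0:
        return 0.0  # sketch-only convention for "no water to blend"
    return sum(f * q for f, q in zip(flows, qualities)) / total_flow


# Hypothetical example: 100 units at a concentration of 150,000 mixed
# with 300 units at a concentration of 50,000.
print(blend([100.0, 300.0], [150000.0, 50000.0]))  # 75000.0
```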
Water Quality Sets
\(\textcolor{blue}{w \in W}\) Water Quality Components (e.g., TDS)
\(\textcolor{blue}{p^{IntermediateNode} \in CP}\) Intermediate Completions Pad Nodes
\(\textcolor{blue}{p^{PadStorage} \in CP}\) Pad Storage
Water Quality Parameters
\(\textcolor{green}{\nu_{p,w,[t]}}\) = Water quality at well pad
\(\textcolor{green}{\xi_{s,w}^{StorageSite}}\) = Initial water quality at storage
\(\textcolor{green}{\xi_{p,w}^{PadStorage}}\) = Initial water quality at pad storage
Water Quality Variables
\(\textcolor{red}{Q_{l,w,t}}\) = Water quality at location
Disposal Site Water Quality \(\forall \textcolor{blue}{k \in K}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
The water quality of disposed water is dependent on the flow rates into the disposal site and the quality of each of these flows.
Storage Site Water Quality \(\forall \textcolor{blue}{s \in S}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
The water quality at storage sites is dependent on the flow rates into the storage site, the volume of water in storage in the previous time period, and the quality of each of these flows. Even mixing is assumed, so all outgoing flows have the same water quality. If it is the first time period, the initial storage level and initial water quality, respectively, replace the water stored and water quality in the previous time period.
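A minimal sketch of the storage site calculation under the even mixing assumption follows. It is written from the verbal description above rather than from PARETO's constraint; the function name and all numbers are hypothetical. In the first period, the initial storage level and initial quality take the place of the previous period's values.

```python
def storage_quality(prev_level, prev_quality, inflows):
    """Quality after evenly mixing stored water with the inflows.

    inflows is a list of (flow, quality) pairs for the current period.
    """
    component_mass = prev_level * prev_quality + sum(f * q for f, q in inflows)
    volume = prev_level + sum(f for f, _ in inflows)
    if volume == 0:
        return prev_quality  # nothing stored and nothing arriving
    return component_mass / volume


# First period: initial level 200 at quality 100, inflow of 300 at quality 200.
q1 = storage_quality(200.0, 100.0, [(300.0, 200.0)])
print(q1)  # 160.0

# Second period: 100 units left storage, so 400 remain at quality q1;
# an inflow of 100 at quality 60 arrives.
q2 = storage_quality(400.0, q1, [(100.0, 60.0)])
print(q2)  # 140.0
```

Because mixing is even, the outgoing flows and the water left in storage at the end of each period all share the computed quality.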
Treatment Site Water Quality \(\forall \textcolor{blue}{r \in R}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
The water quality at treatment sites is dependent on the flow rates into the treatment site, the efficiency of treatment, and the water quality of the flows. Even mixing is assumed, so all outgoing flows have the same water quality. The treatment process does not affect water quality.
The water quality at nodes is dependent on the flow rates into the node and the water quality of the flows. Even mixing is assumed, so all outgoing flows have the same water quality.
Completions Pad Intermediate Node Water Quality \(\forall \textcolor{blue}{p \in P}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
Water Quality at Completions Pads
Water that is piped and trucked to a completions pad is mixed and split into two output streams: Stream (1) goes to the completions pad and stream (2) is input to the completions storage.
This mixing happens at an intermediate node. Finally, water that meets completions demand comes from two inputs: The first input is output stream (1) from the intermediate step. The second is outgoing flow from the storage tank.
The water quality at the completions pad intermediate node is dependent on the flow rates of water from outside of the pad to the pad. Even mixing is assumed, so the water to storage and water to completions input have the same water quality.
Completions Pad Input Node Water Quality \(\forall \textcolor{blue}{p \in P}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
The water quality at the completions pad input is dependent on the flow rates of water from pad storage and water from the intermediate node. Even mixing is assumed, so all water into the pad is of the same water quality.
Completions Pad Storage Node Water Quality \(\forall \textcolor{blue}{p \in P}, \textcolor{blue}{w \in W}, \textcolor{blue}{t \in T}\)
The water quality at pad storage sites is dependent on the flow rates into the pad storage site, the volume of water in pad storage in the previous time period, and the quality of each of these flows. Even mixing is assumed, so the outgoing flow to completions pad and water in storage at the end of the period have the same water quality. If it is the first time period, the initial storage level and initial water quality, respectively, replace the water stored and water quality in the previous time period.
The previous chapter presented a model for tracking water quality. Without fixing the flows, this model is non-linear, because the blending constraints multiply flow variables by water quality variables. By discretizing the water quality into a finite number of values for all locations over time, the model can be made linear again.
The discretization works as follows.
Take for example this term from the Disposal Site Water Quality:
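Discrete Max Disposal Destination \(\forall \textcolor{blue}{l \in L}, \textcolor{blue}{t \in T}, \textcolor{blue}{w \in W}, \textcolor{blue}{q \in Q}\)
For each location and time period, water can be injected at the disposal site for only one discrete quality, and the injected volume is at most the capacity of that disposal site. For all other discrete qualities, the injected volume is zero.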
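The effect of the discretization can be sketched as follows. This is an illustration under assumed notation, not PARETO's formulation: the discrete quality labels (Q0, Q1, Q2), their values, and the function names are hypothetical. The flow to a disposal site is split into one flow per discrete quality; at most one of them may be non-zero and none may exceed the site capacity. Since each discrete quality value is a constant, the component load becomes a linear expression in the flows.

```python
def satisfies_discrete_max(flow_by_level, capacity, tol=1e-9):
    """True if at most one discrete quality carries flow, within capacity."""
    active = [q for q, f in flow_by_level.items() if f > tol]
    within_capacity = all(f <= capacity + tol for f in flow_by_level.values())
    return len(active) <= 1 and within_capacity


def component_load(flow_by_level, level_value):
    """Linear in the flows: sum of (constant quality value) x (flow)."""
    return sum(level_value[q] * f for q, f in flow_by_level.items())


# Hypothetical discrete quality values for one component.
level_value = {"Q0": 0.0, "Q1": 100000.0, "Q2": 200000.0}

ok = {"Q0": 0.0, "Q1": 500.0, "Q2": 0.0}     # one active quality
bad = {"Q0": 0.0, "Q1": 300.0, "Q2": 200.0}  # two active qualities

print(satisfies_discrete_max(ok, capacity=1000.0))   # True
print(satisfies_discrete_max(bad, capacity=1000.0))  # False
print(component_load(ok, level_value))               # 50000000.0
```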
Cafaro, D. C., & Grossmann, I. (2021). Optimal design of water pipeline networks for the development of shale gas resources. AIChE Journal, 67(1), e17058.
PARETO tutorials are currently under development and will be made publicly available as soon as possible.
Because PARETO is an open-source project developed in collaboration with the WaterTAP and IDAES projects, the IDAES tutorials and WaterTAP tutorials are good learning materials.
The PARETO project provides a set of user-friendly utility methods to display and analyze results. These methods include debugging tools, plotting utilities, and Python-Excel interfaces.
This method uses Pandas methods to read data for sets and parameters from an
Excel spreadsheet. Sets are assumed to have neither a header nor an index column.
In addition, the data should be placed in column A, row 2.
Parameters can be in either table or column format. Table format requires a header (usually time periods) and index columns whose elements should be contained in a set. Each index column should be labeled with a header starting in cell A2. Spreadsheet names for sets should be used as headers; however, the generic keywords “NODES” and “INDEX” can also be used. Column format requires that each set be placed in one column, starting from cell A3. Spreadsheet names for sets should be used as headers in row 2 for each column; “NODES” and “INDEX” can also be used. Data should be provided in the last column, and the keyword “VALUE” should be used as its header.
This method outputs a dictionary that contains a list for each set
and a dictionary that contains parameters in the following format:
{'param1': {(set1, set2): value}, 'param2': {(set1, set2): value}}
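To make the output format concrete, the sketch below converts a parameter given in table format and one given in column format into the {(set1, set2): value} dictionaries described above. It works on plain Python lists instead of an Excel file and does not call PARETO; the helper names, the parameter names, and the sample data are hypothetical, and skipping empty cells is an assumption of this sketch.

```python
def table_to_param(header, rows, n_index_cols=1):
    """Table format: index column(s) first, then one column per header label."""
    param = {}
    for row in rows:
        index = tuple(row[:n_index_cols])
        for label, value in zip(header[n_index_cols:], row[n_index_cols:]):
            if value is None:  # assumption: empty cells are skipped
                continue
            param[index + (label,)] = value
    return param


def column_to_param(rows):
    """Column format: one column per set, with the data in the last column."""
    return {tuple(row[:-1]): row[-1] for row in rows}


# Table format: header row with time periods, one index column.
header = ["ProductionPads", "T01", "T02"]
table_rows = [["PP01", 100, 90], ["PP02", 80, None]]

# Column format: set columns followed by a VALUE column.
column_rows = [["PP01", "N01", 5.0], ["PP02", "N01", 7.5]]

parameters = {
    "ProductionRates": table_to_param(header, table_rows),
    "Distances": column_to_param(column_rows),
}
print(parameters["ProductionRates"])
print(parameters["Distances"])
```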
This method checks whether the elements included in a table or parameter have been defined as part of the
Sets that index that parameter. set_consistency_check() raises a TypeError exception if there are entries in the Parameter that are not
contained in the Sets, and prints out a list of all the entries that require revision.
How to Use:
The method requires one specified parameter (e.g., ProductionRates) AND one OR several sets over which
that parameter is declared (e.g., ProductionPads, ProductionTanks, TimePeriods). In general,
the method can be run as follows: set_consistency_check(Parameter, set_1, set_2, etc)
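The idea behind the check can be sketched in a few lines. The function below is a hypothetical stand-in written for illustration, not the source of set_consistency_check(); it assumes the parameter is a dictionary keyed by tuples and that each set is passed in the same order as the positions in the key.

```python
def check_set_consistency(param, *sets):
    """Raise TypeError if a parameter key uses elements missing from its sets."""
    missing = set()
    for key in param:
        key = key if isinstance(key, tuple) else (key,)
        # Compare each position of the key with the set that indexes it.
        for element, valid_elements in zip(key, sets):
            if element not in valid_elements:
                missing.add(element)
    if missing:
        entries = sorted(missing)
        print("Entries that require revision:", entries)
        raise TypeError(f"Parameter entries not found in the sets: {entries}")
    return True


# Hypothetical data: "PP03" is used in the parameter but not defined in the set.
production_pads = ["PP01", "PP02"]
time_periods = ["T01", "T02"]
production_rates = {("PP01", "T01"): 100, ("PP03", "T01"): 40}

try:
    check_set_consistency(production_rates, production_pads, time_periods)
except TypeError as err:
    print("Check failed:", err)
```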
This method allows the user to request drive distances and drive times using the Bing Maps API and
the Open Street Maps API.
The method accepts the following input arguments:
origin:
REQUIRED. Data containing the location names and their coordinates
(latitude and longitude). Two formats are acceptable:
{(origin1,"latitude"): value1, (origin1,"longitude"): value2} or
{origin1:{"latitude":value1, "longitude":value2}}
The first format allows the user to include a tab with the corresponding data
in table format as part of the case study workbook.
destination:
OPTIONAL. If no data for destination is provided, it is assumed that the
origins are also destinations.
api:
OPTIONAL. Specify the type of API service. Two options are supported: Bing Maps and Open Street Maps.
output:
OPTIONAL. Define the parameters that the method will output. The user can select:
‘time’: A list containing the drive times between the locations is returned
‘distance’: A list containing the drive distances between the locations is returned
‘time_distance’: Two lists containing the drive times and drive distances between the locations are returned
If no output is specified, ‘time_distance’ is the default
fpath:
OPTIONAL. od_matrix() will ALWAYS output an Excel workbook with two tabs, one that
contains drive times, and another that contains drive distances. If no path is
specified, the Excel file is saved with the name ‘od_output.xlsx’ in the current
directory.
create_report:
OPTIONAL. If True, an Excel report with drive distances and drive times is created
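The two accepted formats for the origin argument carry the same information. The sketch below converts either format into the nested form, which can help when preparing input data. The helper name, location names, and coordinates are hypothetical, and the code does not call od_matrix() or any mapping API.

```python
def normalize_origins(origin):
    """Return {name: {"latitude": ..., "longitude": ...}} from either format."""
    nested = {}
    for key, value in origin.items():
        if isinstance(key, tuple):
            # Flat format: {(name, "latitude"): value, (name, "longitude"): value}
            name, field = key
            nested.setdefault(name, {})[field] = value
        else:
            # Nested format: {name: {"latitude": value, "longitude": value}}
            nested[key] = dict(value)
    return nested


# Hypothetical coordinates for two locations, in the flat (table-like) format.
flat = {
    ("PP01", "latitude"): 32.0,
    ("PP01", "longitude"): -102.0,
    ("CP01", "latitude"): 32.5,
    ("CP01", "longitude"): -101.5,
}
# The same data in the nested format.
nested = {
    "PP01": {"latitude": 32.0, "longitude": -102.0},
    "CP01": {"latitude": 32.5, "longitude": -101.5},
}

print(normalize_origins(flat) == normalize_origins(nested))  # True
```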
This method identifies the type of model (strategic or operational), creates a printing list based on is_print,
and creates a dictionary that contains headers for all the variables that will be included in an Excel report.
The dictionaries are used to create separate Excel sheets which categorize the data by variable name or type.
This same data is put into Excel sheets named after each variable, as well as an overview sheet which contains totals and Key Performance Indicator (KPI) information.
Warning
If an indexed variable is added or removed from a model, the printing lists and headers should be updated
accordingly.
The output of this method prints out each variable’s information in the terminal as specified by the user, as shown below.
Sankey diagrams are a graphic tool used to easily visualize supply-sink flows across a given infrastructure (source/destination).
The relative width of each “flow” is proportional to the amount of water that is being transported between locations.
Such diagrams are commonly used to visualize the complex nature of money, energy or material flows.
This method receives the final lists for source, destination, value, and labels to be used
in generating the Sankey diagram. It also receives arguments that determine font size and
plot titles. The user can save the Sankey diagram in the following formats: jpg, jpeg, pdf, png, svg, and html. The html format is the default.
How to Use:
import plotly.graph_objects as go
# Creating links and nodes based on the passed-in lists to be used as the data for generating the Sankey diagram
link = dict(source=source, target=destination, value=value)
node = dict(label=label, pad=30, thickness=15, line=dict(color="black", width=0.5))
data = go.Sankey(link=link, node=node)
# Assigning the Sankey diagram to the fig variable
fig = go.Figure(data)
fig.write_html("first_figure.html", auto_open=True)
This method receives data in the form of 3 separate lists (origin, destination, and value lists), the generate_report() dictionary
output format, or the get_data() dictionary output format. It then places this data into 4 lists of unique elements so that
each element can be assigned an index and the entries of the lists correspond to each other by index.
These lists are then passed into the outlet_flow method, whose output is in turn passed into the method that generates the
Sankey diagram.
Figure 3. Example of Sankey Diagram Showing Water Production Flows
How to Use:
This method requires two parameters:
1.) An input data dictionary that includes the data and the requested time periods. The data is passed in as ‘pareto_var’ and can be in get_data() format (which requires labels), in generate_report() format, or 3 separate lists:
“pareto_var” – The variable data, as returned by get_data() or generate_report()
“time_period” – Specifies which time periods from the data the user wants shown in the diagram. If the user passes in no time periods, all time periods in the data are used.
“labels” – This is only required if the data being passed in is in get_data() format. The labels are used to distinguish between the columns.
2.) A dictionary of arguments that include formatting options like font size, title of the plot and output file:
output_file – This parameter is used for creating the file that contains the Sankey Diagram created by this method
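A Sankey diagram needs integer indices for its sources and targets, plus one label per unique location. The sketch below shows one way to derive those lists from origin, destination, and value lists. It illustrates the indexing idea only and is not the code PARETO runs; the function name and the sample data are hypothetical.

```python
def build_sankey_lists(origins, destinations, values):
    """Map location names to indices and return (source, target, value, label)."""
    labels = []
    index = {}
    # Assign an index to each location the first time it appears.
    for name in list(origins) + list(destinations):
        if name not in index:
            index[name] = len(labels)
            labels.append(name)
    source = [index[o] for o in origins]
    target = [index[d] for d in destinations]
    return source, target, list(values), labels


# Hypothetical flows: two production pads feed a node, which feeds a disposal site.
origins = ["PP01", "PP02", "N01"]
destinations = ["N01", "N01", "K01"]
values = [100, 50, 150]

source, target, value, label = build_sankey_lists(origins, destinations, values)
print(source)  # [0, 1, 2]
print(target)  # [2, 2, 3]
print(label)   # ['PP01', 'PP02', 'N01', 'K01']
```

The resulting lists have the shape expected by plotly's go.Sankey, where link.source and link.target hold indices into node.label.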
This method generates a bar chart based on the variable data that the user passes in. It automatically creates either an animated bar chart (if the variable is indexed by time) or a static bar chart.
1.) A dictionary including the data and labels that are being used, either in get_data() output format or generate_report() output format. (Labels only required for get_data() format).
“pareto_var”– This parameter contains the data that the user wants to use
“labels”– This is a tuple that contains the labels for each column of the data provided.
2.) A dictionary of arguments that include the title of the plot, a group by parameter, and an output file. Here is an example of the arguments:
“group_by” - This specifies what field will be used as the x axis in the plot
“output_file” - This parameter is used for creating the file that contains the Bar Chart created by this method.
“y_axis” - This specifies whether the user wants a logarithmic y axis. If not provided, the y axis remains linear (the default).
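The role of “labels” and “group_by” can be illustrated with a small aggregation sketch. It assumes the data is a list of tuples whose columns are named by the labels tuple and whose last column holds the value, that values are summed per group, and that the first column is used when no group is given. These are assumptions of the sketch; the function name and data are hypothetical and PARETO's plotting code is not called.

```python
from collections import defaultdict


def group_values(rows, labels, group_by=None):
    """Sum the last column of each row, grouped by the column named group_by."""
    column = labels.index(group_by) if group_by else 0
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += row[-1]
    return dict(totals)


# Hypothetical variable data with one label per column.
labels = ("Origin", "Destination", "Time", "Value")
rows = [
    ("PP01", "N01", "T01", 100.0),
    ("PP01", "N01", "T02", 80.0),
    ("PP02", "N01", "T01", 50.0),
]

print(group_values(rows, labels, "Origin"))  # {'PP01': 180.0, 'PP02': 50.0}
print(group_values(rows, labels, "Time"))    # {'T01': 150.0, 'T02': 80.0}
```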
This method creates the scatter plot that is generated from the variable data that the user passes in. It creates either an animated scatter plot(if the variable is indexed by time) or a static scatter plot.
Figure 5. Animated Scatter Chart. Notice the time period slider at the bottom.
How to Use
This method requires two parameters:
1.) An input data dictionary that includes the variables for the x and y axes, a size parameter, and a labels parameter that provides a tuple of labels (only required for get_data() format) for the x, y, and size variables.
“pareto_var”– This parameter contains the data that the user wants to use.
“labels”– This is a tuple that contains the labels for each column of the data provided.
“size”- This specifies what will be used for the size of each individual marker on the plot. If the size parameter is not provided, a default size is given to all the markers. There are 3 options for the size parameter:
“x/y” - This specifies that size will be calculated as a ratio of the x variable data over the y variable data
“y/x” - This specifies that size will be calculated as a ratio of the y variable data over the x variable data
A Pareto variable that contains data for the size of the bubbles. The data must match the column used for grouping the data in the option “group_by”.
Figure 6. Options for specifying the bubbles size.
2.) A dictionary of arguments that include the title of the plot, a group by parameter, and an output file. Here is an example of the arguments:
“group_by” - This specifies what field will be used as the x axis in the plot. The column name should be used to indicate how to group the data.
If “group_by” is not specified, then first column is used.
“output_file” - This parameter is used to name a file that the figure will be output to. It can be a file path such as “..\first_figure.html” or just the file name itself “first_figure.html”.
There will always need to be a specified extension to the file. The accepted file extensions are as follows: .html, .png, .jpg, .jpeg, .svg, .pdf
“print_data” - The PARETO methods allow the user to specify if they want the plotted data to be printed in the console (default is False):
True: The dataframe used for creating the figure is printed in the console
Figure 7. Setting print_data to True will print out a dataframe for easy inspection.
“group_by_category” - This specifies how the color of the nodes will be assigned for easy visualization. There are 3 options:
True: The chart markers are colored based on the names of the nodes. For example, PP, CP, N, R, S, K, etc.
will each be assigned a unique color.
False: The data won’t be categorized by color; therefore, one color will be used for all chart markers.
A Pareto variable containing a custom categorization. The method will recognize the variable automatically, and the values in this variable
will be used for assigning colors to the categories that are provided. An Excel sheet should be created with all node names (with duplicates removed),
assigning to each node a numerical value for the category the user would like it to be associated with. This approach is best for
situations where nodes of different types are to be categorized together.
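Two of the options above lend themselves to a short sketch: the “size” options (“x/y”, “y/x”, or custom data) and the accepted extensions for “output_file”. The helper names, the default marker size, and the case-insensitive extension check are assumptions made for illustration; this is not PARETO's plotting code.

```python
import os

# Extensions listed in the documentation for "output_file".
ACCEPTED_EXTENSIONS = {".html", ".png", ".jpg", ".jpeg", ".svg", ".pdf"}


def has_accepted_extension(output_file):
    """True if the file name ends with one of the accepted extensions."""
    return os.path.splitext(output_file)[1].lower() in ACCEPTED_EXTENSIONS


def marker_sizes(x, y, size=None, default=10.0):
    """Marker sizes for the "x/y" and "y/x" options, or custom size data."""
    if size is None:
        return [default] * len(x)        # same size for every marker
    if size == "x/y":
        return [a / b for a, b in zip(x, y)]
    if size == "y/x":
        return [b / a for a, b in zip(x, y)]
    return list(size)                    # custom data, one value per marker


x = [10.0, 20.0]
y = [5.0, 40.0]
print(marker_sizes(x, y, "x/y"))  # [2.0, 0.5]
print(marker_sizes(x, y, "y/x"))  # [0.5, 2.0]
print(has_accepted_extension("first_figure.html"))  # True
print(has_accepted_extension("first_figure.txt"))   # False
```

Note that the ratio options divide by the data values, so zeros in the denominator variable would need to be handled before plotting.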
The PARETO project provides examples to run the operational produced water management model
and the strategic produced water management model (see pareto/case_studies/).
To run the examples, go to:
With Python 3.8, and possibly other versions, running Jupyter on Windows 10 can produce an error about a
missing win32api DLL. There is a relatively easy fix:
PARETO Copyright (c) 2021, by the software owners: The Regents of the University of California,
through Lawrence Berkeley National Laboratory, et al. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted
provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions
and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions
and the following disclaimer in the documentation and/or other materials provided with the
distribution.
Neither the name of the Produced Water Application for Beneficial Reuse Environmental Impact and
Treatment Optimization (PARETO), University of California, Lawrence Berkeley National Laboratory,
U.S. Dept. of Energy, nor the names of its contributors may be used to endorse or promote
products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the
features, functionality or performance of the source code (“Enhancements”) to anyone; however, if
you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley
National Laboratory, without imposing a separate written license agreement for such Enhancements,
then you hereby grant Lawrence Berkeley National Laboratory the following license: a non-exclusive,
royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into
other computer software, distribute, and sublicense such enhancements or derivative works thereof,
in binary and source code form.
PARETO was produced under the DOE Produced Water Application for Beneficial Reuse Environmental
Impact and Treatment Optimization (PARETO), and is copyright (c) 2021 by the software owners: The
Regents of the University of California, through Lawrence Berkeley National Laboratory, et al. All
rights reserved.
NOTICE. This Software was developed under funding from the U.S. Department of Energy and the
U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted
for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license
in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform
publicly and display publicly, and to permit others to do so.