Multi-Scale Soft Modeling Approach To Prediction of Rock Density. Step I: Mathematical Modeling of Natural Density On Specimen Scale Using Gene Expression Programming

Mostafa Asefi a,*, Farshad Rashidi Nejad b
a Department of Mining Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
b School of Mining Engineering, Faculty of Engineering, The University of New South Wales, Sydney, Australia
ARTICLE INFO

Keywords: Rock density; Modeling; Intact specimen; Gene Expression Programming; Iron Ore Deposit

ABSTRACT

Rock density is one of the most important variables during a mine planning process. In this study, a comprehensive approach was developed to predict the rock density distribution by modeling its behavior on different scales. According to the multi-scale approach, the natural behavior of density should first be modeled mathematically on an intact specimen scale (Step I). Then this model is used as a base for geostatistical studies and prediction of natural density for large blocks on a region scale (Step II). Finally, the natural density is modified and converted to bulk density based on the structural situation of each block (Step III). The paper presented herein deals specifically with modeling of natural density on the specimen scale by the Gene Expression Programming method. The Sangan iron ore deposit in Iran was considered as a case study. The data sets used in the training and testing stages were randomly selected from all experimental data. In developing the proposed model, six parameters were incorporated: the percentages of Fe, CaO, LOI, and SiO2, the mean depth of samples, and the Fe:FeO ratio. It was observed that a four-gene GEP-RNC algorithm gives the best results. For the best model, the mean absolute error (MAE) and correlation coefficient (R) were found to be 0.088 and 0.970, respectively. Sensitivity analysis was also performed to understand the effect of each influencing parameter on the natural density.
* Corresponding author. Tel.: +98 81 38331046.
E-mail addresses: m.asefi@srbiau.ac.ir (M. Asefi), f.rashidinejad@unsw.edu.au (F. Rashidi Nejad).

1. Introduction

Density is defined as the ratio of the mass of a material to its volume. That is, density expresses the relationship between the mass of something and its size.
Density determinations of ore and waste during mine planning require close attention because they directly affect the conversion of volumes to tonnages. Rapid, accurate and precise prediction on a large scale, such as mining blocks, is therefore desirable. The density distribution of ore is needed for ore reserve estimation and long-term planning, and the density distribution in waste rock types is critical to accurately determining stripping ratios, mining costs and scheduling.
Evaluation of a mineral resource is very sensitive to density distribution. Normally, sample points will support three estimated (kriged) points (Moon, 2006). Thus, on a 100 meter drilling grid, for example, block size would be 25 meters. In such a 25 × 25 meter block with a 15 meter bench height, a density prediction error of 0.1 t/m3 (0.1 g/cc) leads to a 937.5 tonne error in the block mass estimation.
During resource modelling studies, three techniques are available for determining material density (Hustrulid et al, 2013):
- Technique 1. Careful excavation and weighing of a large volume.
- Technique 2. Calculation based upon composition (mineralogy) using published densities.
- Technique 3. Density testing of small samples in the laboratory.
Technique 1 is the only way to measure the bulk density on a rock mass scale and provides the best site-specific results, but it is the most expensive and time consuming, and it also severely reduces the capability of the density distribution modeling.
In Technique 2, care must be taken to correct for porosity and moisture. Furthermore, minerals differ in physical properties, such as density, according to their crystallization process. This effect is considerable in high density minerals, especially when the technique is used for large scale estimation.
For Technique 3, two primary tests are used. In the first, the sample is first weighed in air (M). The sample
volume (V) is then determined by water displacement. The density is then calculated:

$$\rho = \frac{M}{V} \tag{1}$$

In the second type of test, the sample is first weighed in air ($M_a$) and then weighed when suspended in water at standard temperature and pressure ($M_s$). In this case, the specific gravity (SG) instead of the density is calculated by means of the equation:

$$SG = \frac{M_a}{M_a - M_s} \tag{2}$$

In fact, specific gravity, a term that is widely used interchangeably with density, is relative density, and is therefore unitless. In its mineralogic use, specific gravity applies to solid, nonporous materials; consequently, use of the term should be avoided in general reference to ore and waste, with deference to the term bulk density. Of course, when porosity is negligible, the numeric value of specific gravity is identical with the numeric value of density (Sinclair and Blackwell, 2004).
Rock densities range from less than 2.0 g/cc for soft sediments and coals to more than 3.0 g/cc for mafic and ultramafic rocks. Many ore minerals, particularly metal sulfides and oxides, are very much denser than the minerals that make up the bulk of most rocks, and orebodies are thus often denser than their surroundings (Moon, 2006).
From this point of view, in iron ore deposits, for example, which contain high density minerals such as magnetite and hematite, it is normal practice to use Fe grade to predict density. This approach, however, is not accurate and precise enough because, on the one hand, it ignores the effect of porosity and, on the other, density is then predicted based on the spatial anisotropy model of kriged Fe.
Instead, an approximation that is considered to be the most accurate, not only in iron deposits but also in all cases, should cover the following issues:
- An intact rock specimen is composed of many minerals. Every mineral has a limited range of density. Due to differences in the percentages of minerals and intergranular porosity, the density of the same type of rock differs within a certain range.
- In a rock mass, due to the distribution of fractures and cracks, the value of density differs from the density of the parent rocks in the intact form. Such discontinuities in a rock mass have zero density and must be weighted appropriately in the determination of bulk density.
- In a geological region, density should be considered as a regionalized variable. So, on a region scale, the density can only be estimated based on its own spatial anisotropy model.
In this study, a comprehensive approach has been developed to deal with these related features of density and present a reliable bulk density for mining blocks. In the next section the general structure of this approach is presented.

2. The MSM Approach

2.1. Terminology

First, it is necessary to distinguish between different densities that are related to different rock components:
- $\rho_i$: density of any individual mineral in a rock (component i);
- $\rho_n$: natural density, the mean density of the considered intact rock specimen (including intergranular porosity);
- $\rho_b$: bulk density, the mean dry density of the considered rock mass volume (including discontinuities created by tectonic and thermal stresses).
It should be noted that the same definitions can be considered for specific gravity in cases in which specific gravity values have been measured and recorded in the exploration database. In this way, the numeric values of natural and bulk specific gravity are identical, respectively, with the numeric values of natural and bulk density.
In an intact rock specimen consisting of n components, the following equation gives the $\rho_n$ value:

$$\rho_n = \sum_{i=1}^{n} \left[ (1 - \phi)\, \rho_i\, V_i \right] + \phi\, \rho_f \tag{3}$$

where $\phi$ is the total intergranular porosity, $V_i$ is the volume fraction of component i and $\rho_f$ is the mean density of the pore fluid.
For a rock mass, the volume fraction of the total discontinuities is given by $\phi_{discontinuity}$ and $\rho_b$ is computed using the following equation:

$$\rho_b = (1 - \phi_{discontinuity})\, \rho_n \tag{4}$$

Most magmatic (intrusive) and metamorphic rocks have almost no intergranular porosity. Formed by crystallization, the grains intergrow tightly, leaving almost no void space. Typically, granite after formation has a minimal porosity $\phi$ = 0.001, most of which occurs as small irregular cavities that are remnants of the crystallization process (Schon, 2011).
Volcanic (extrusive) rocks are different. Rapid cooling and pressure decrease can result in porosity. Typical volcanic rocks are porous basalts.
Tectonic and thermal stresses can later create fractures and cracks; they represent planar discontinuities and occupy a very small volume fraction, but can create a connected network on a rock mass scale.
Intergranular porosity of the rocks can be quickly estimated from petrographic studies; conversely, characterization of fractures is difficult and additional parameters describing geometry and orientation are necessary.
Thus, the natural density of intact rock ($\rho_n$) is considered as a function of its components and intergranular porosity on a specimen scale, and of the spatial anisotropy parameters of the $\rho_n$ distribution within a specific region called the geological domain. A domain is taken to be either a 2D or 3D region within which all data are related from a specific point of view (such as geology in this case).
Bulk density, in contrast to natural density, is not an intrinsic quality of rocks. It depends on structural phenomena within a specific region called the structural domain.

2.2. Methodology

The MSM approach (Multi-Scale Soft Modeling) is a step-by-step methodology in which one scale is considered in each step and the model created in one scale prepares the input for the next scale. Note that the prediction of density ($\rho_b$) is only completed after modifying the natural density of each block to the bulk density according to the structural situation of the related domain.
The fundamental steps of the MSM approach, focusing on Step I, which is the subject of this paper, are schematically represented in Fig. 1.
Fig. 1. The flowchart of the MSM approach.
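The density definitions of Section 2.1 (Eqs. 3 and 4) and the block-mass sensitivity quoted in the Introduction can be illustrated with a minimal Python sketch. The function names, the mineral densities (about 5.2 g/cc for magnetite and 2.71 g/cc for calcite), the water-filled pores and the volume fractions are illustrative assumptions, not values taken from the Sangan database. The sketch also assumes that the volume fractions refer to the solid phase and sum to 1.

```python
def natural_density(components, porosity, fluid_density=1.0):
    """Eq. (3): natural density of an intact specimen.

    components: list of (mineral_density, volume_fraction) pairs; the
    fractions are assumed to describe the solid phase and to sum to 1.
    """
    total_fraction = sum(fraction for _, fraction in components)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("volume fractions must sum to 1")
    solid = sum((1.0 - porosity) * rho * fraction for rho, fraction in components)
    return solid + porosity * fluid_density


def bulk_density(rho_n, discontinuity_fraction):
    """Eq. (4): discontinuities are treated as zero-density volume."""
    return (1.0 - discontinuity_fraction) * rho_n


def block_mass_error(side_m, bench_m, density_error):
    """Mass error (tonnes) of a square block for a density error in t/m3."""
    return side_m * side_m * bench_m * density_error


# Hypothetical specimen: 60 % magnetite and 40 % calcite by volume, 1 % porosity.
rho_n = natural_density([(5.2, 0.6), (2.71, 0.4)], porosity=0.01)
rho_b = bulk_density(rho_n, discontinuity_fraction=0.05)

# A 25 m x 25 m block with a 15 m bench and a 0.1 t/m3 density error
# reproduces the 937.5 t figure quoted in the Introduction.
error_t = block_mass_error(25, 15, 0.1)
```

Because Eq. (4) is a simple scaling, any error in the natural density carries over proportionally to the bulk density and then to the block tonnage.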
Step I begins with the domain separation process. It is very important to recognize separate domains within an area, group all sample data contained within each domain into distinct subsets, and use data from each separate domain to make estimations within that domain.
Then, the scale of the intact specimen should be determined. The intact specimen scale for each domain is defined as a sample length through which the rock properties remain intact. If the number of sample data having this length (or shorter) is enough to distinguish the spatial anisotropy in the density distribution on the domain scale, then the algorithm is terminated in this step, and the sample data can be used directly in Step II. Otherwise, the natural density of intact specimens should be modeled mathematically. In this case, Step I is terminated only after extending the model over all samples and preparing enough data for the geostatistical study in the next step.
In the MSM approach, to preserve generality in an environment of uncertainty and imprecision such as a mineral deposit, a soft viewpoint should be adopted in all modelling steps from the beginning.
From the next section onward, the stages of Step I are discussed and implemented in a real case.

3. Case Study

The Sangan iron ore deposit is located in east-northeast Iran, approximately 30 km from the Afghanistan border. This deposit lies in an area of approximately 30 km length and 8 km width, and its geographical coordinates are 34°24' north and 60°16' east. Three principal zones of mineralization have been identified within the Sangan deposit and are referred to as Western (Anomaly A, Anomaly B, Anomaly C-north, and Anomaly C-south), Central and Eastern. This study focuses on Anomaly B in the Western zone, which is, at present, under development.

3.1. Domains

The Sangan Deposit has been classified as a skarn deposit. The original host rock was probably impure limestone or dolomite that underwent alteration.
The iron ore mineralization in Anomaly B occurs in a zone of thickened lenses of rich (massive) and poor (disseminated) magnetite that range from 130 m to 300 m thick. The dominant gangue minerals are calcite, dolomite and quartz. Chemically, after Fe, the dominant components are CaO, MgO, SiO2, and Al2O3.
Anomaly B is capped by oxidized magnetite (hematite). At surface, oxidation has created a pervasive thin layer that has formed minerals such as hematite, limonite, goethite and secondary carbonates. The oxidation has lowered the overall specific gravity of the ore. The oxidation zone is characterized by a ratio of total Fe to FeO that is greater than 7 in oxidized ore samples.
The iron also occurs in the carbonate form as siderite (FeCO3) and ankerite (Ca(Fe,Mg)(CO3)2), particularly in the disseminated or magnetite-poor mineralized zones. These carbonates are not easily identified optically, and X-Ray Diffraction analysis has been used. Also, the ratio of total Fe to FeO indicates the presence of iron carbonates. Most unoxidized ore samples have a Fe:FeO ratio <2.33, which is the theoretical maximum for magnetite. The low ratio value indicates the presence of reduced iron in the carbonates. This is important because metallurgical recovery of
the iron decreases with the rising proportion of iron in the non-recoverable carbonates.
There are sufficient statistical, geochemical and mineralogical distinctions between the oxidized, non-oxidized and reduced iron zones to warrant separate domain classification as follows:
- Domain RD: Reduced Ore, Fe/FeO < 2.33;
- Domain UX: Unoxidized Ore, 2.33 < Fe/FeO < 7;
- Domain OX: Oxidized Ore, Fe/FeO > 7.
The modeling of natural density in this study has been performed in Domain RD. The methodology presented herein can be used, however, to deal with other domains.

3.2. Database

Prior to half-core splitting during the assaying process, a small section (50 mm) of whole core (chosen as the intact specimen) was sent to the laboratory for SGn measurement. The SGn was determined using the weight in air versus weight in water method. If the rock was porous, the core was covered in wax first, but this was only done on rare occasions. The specimen was also analyzed for other parameters such as the percentages of Fe, FeO, P, etc.
After classifying the data and removing outliers, only 70 samples remained in Domain RD. Statistical and geological analysis of the SGn data indicated that it is strongly influenced by the parameters listed in Table 1. Summary statistics as well as their descriptions can also be seen in the table.
It is worth pointing out, however, that some of the SGn property variables, especially rock components, may be fundamentally interdependent. High positive or negative correlation coefficients between the pairs may lead to poor performance of the models and difficulty in interpreting the effects of the explanatory variables on the response. This interdependency can cause problems in analysis as it will tend to exaggerate the strength of relationships between variables. This is commonly known as the problem of multicollinearity (Musavi, 2012).
Thus, in the process of choosing appropriate predictor variables, first the correlation coefficients between all possible pairs were determined. Then, within each pair of related variables, the variable with less influence on SGn was removed from the modeling process.
To achieve generalization capability and prevent overfitting, the data were grouped into training (80%) and testing (20%) sets. The patterns used in the test and training sets were randomly selected.
Gene Expression Programming was then utilized to obtain meaningful relationships between the natural density of ore in Domain RD and the influencing parameters as follows:

SGn = f(Fe, Fe/FeO, MD, SiO2, CaO, LOI)

The GEP algorithm can be substantially useful for this purpose by directly extracting the knowledge contained in the experimental data. In the next section, the fundamental mechanism of the GEP algorithm is briefly introduced.

4. Overview of Gene Expression Programming

4.1. Basic Difference between GAs, GP and GEP

Genetic algorithms (GAs) were invented by John Holland in the 1960s with the main idea of survival of the fittest (Holland, 1992) and were then popularized by Goldberg (Goldberg, 1989). Like all evolutionary computer systems, GAs are an oversimplification of biological evolution. In this case, solutions to a problem are usually encoded in fixed length strings of 0s and 1s (chromosomes), and populations of such strings are manipulated in order to evolve a good solution to a particular problem (Ferreira). The fitness of each chromosome in a GA is a measure of how well the individual is adapted to the problem. Individuals are selected randomly, with a probability that favors the fittest. In the original GA, modification was introduced by mutation, crossover, and inversion.
The chromosomes of GAs are not only keepers of the genetic information that is replicated and transmitted with modification to the next generation, but are also the object of selection. In fact, they function simultaneously as genome and phenome. The variety of functions GA chromosomes are able to play is severely limited by this dual role and by their structural organization. For instance, it would not be possible in such systems to use only a particular region of the genome as a solution to the problem: the whole genome is always the solution. Obviously these systems are severely constrained (Ferreira, 2006).
Genetic Programming (GP) is an extension to genetic algorithms proposed by Koza that creates computer programs without being explicitly programmed, using the principle of Darwinian natural selection (Koza 1992). The difference between GP and GA is related to the representation of the solution. A string of numbers is created by GA to represent the solution, while the GP solutions are computer programs commonly represented as tree structures (parse trees).
So, in GP, the genetic operators act directly on the parse tree, and this greatly limits the technique. In fact, the palette of genetic operators available to GP is very limited, because most of them would result in invalid parse trees. A GP-specific crossover is practically the only genetic operator used in most GP implementations. For this reason, in GP, huge populations of parse trees are used with the aim of creating all the necessary building blocks with the inception of the initial population, in order to guarantee the discovery of a solution only by moving these initial building blocks around (Ferreira, 2001).
In 2001 Ferreira proposed gene expression programming (GEP) by drawing lessons from the gene expression rule of biological heredity (Ferreira, 2001). GEP is a natural development of GAs and GP; it uses the same kind of diagram representation as GP, but the entities produced by GEP (expression trees) are the expression of a genome that is a linear string of fixed length. This approach overcomes both the limited functional complexity of GAs and the difficulty of applying genetic operators in GP.
Table 1.
Description and statistics of the input and output parameters in the GEP model.
Parameter | Description | Symbol | Analytical Technique | Range | Mean | Precision
Input | Fe % | d0 | Wet chemical digestion, Titration | 22.14-62.76 | 47.00 | 0.01
Input | Fe/FeO ratio | d1 | Fe/FeO* | 1.66-2.31 | 2.09 | 0.01
Input | Mean Depth (m) | d2 | Geometric Calculation | 22.3-324.2 | 157.24 | 0.01
Input | SiO2 % | d3 | Wet chemical digestion, Weight difference | 0.44-10.57 | 3.87 | 0.01
Input | CaO % | d4 | Wet chemical digestion, Titration** | 0.09-34.7 | 16.20 | 0.01
Input | Loss on Ignition % | d5 | Heating (1050 °C), Weight difference | 1.06-28.62 | 14.97 | 0.01
Output | Natural Specific Gravity | SGn | Weight in air versus weight in water | 3.17-4.59 | 3.90 | 0.01
* Measured by the same method as Fe
** Measured on the residue after dissolution of Si
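The data preparation described in Sections 3.1 and 3.2 (domain assignment by Fe/FeO ratio, removal of one variable from each highly correlated pair, and a random 80/20 split) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 0.9 correlation threshold is not given in the paper, "less influence on SGn" is approximated by the absolute correlation with the target, and samples that fall exactly on a domain boundary are placed in UX.

```python
import random
from math import sqrt


def classify_domain(fe_total, feo):
    """Assign a sample to RD, UX or OX from its total-Fe to FeO ratio."""
    ratio = fe_total / feo
    if ratio < 2.33:
        return "RD"
    if ratio > 7:
        return "OX"
    return "UX"  # boundary values land here by assumption


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_predictors(columns, target, threshold=0.9):
    """From each highly correlated pair, drop the predictor that is less
    correlated with the target (threshold is an assumed value)."""
    kept = list(columns)
    for a in columns:
        for b in columns:
            if a < b and a in kept and b in kept:
                if abs(pearson(columns[a], columns[b])) >= threshold:
                    weaker = min((a, b), key=lambda k: abs(pearson(columns[k], target)))
                    kept.remove(weaker)
    return kept


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Random split; with 70 samples this gives 56 training and 14 testing cases."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


train, test = train_test_split(range(70))
```

With the 70 samples of Domain RD, the split yields 56 training cases, which matches the number of fitness cases listed in Table 2.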
4.2. Chromosome and Expression Tree

The leading roles in GEP are the chromosome (genotype) and the expression tree (phenotype). They may mutually transform through the Karva language.
GEP chromosomes consist of a linear, symbolic string of fixed length composed of one or more genes, each gene coding for a structurally and functionally unique expression tree (sub-ET). So, the complete expression of a chromosome requires the interaction of these sub-ETs using a linking function. Thanks to the Karva language, it is possible to infer immediately an ET given the sequence of a gene, and vice versa.
A gene is made up of two parts, a head and a tail. The head is composed of both functions and terminals, but the tail includes only terminals. The terminal set consists of the independent variables. The choice of the appropriate function set is not so obvious, but a good guess can always be made in order to include all the necessary functions.
The length of the head h in a gene is chosen, whereas the length of the tail t is a function of h and n, where n is the number of arguments of the function with the most arguments (also called the biggest operand), and t is obtained by the following equation:

$$t = h(n - 1) + 1 \tag{5}$$

The phenotype is an ET, which is transformed from the genotype according to the syntax rules. First, the start of a gene corresponds to the root of the ET, this node forming the first line. Second, depending on the number of arguments to each element (functions may have a different number of arguments, whereas terminals have none), as many nodes are placed in the next line as there are arguments to the functions in the previous line. Third, from left to right, the nodes are filled, in the same order, with the elements of the gene. Fourth, the process is repeated until a line containing only terminals is formed.
Consider, for example, a gene for which the set of functions is F = {Q, *, /, -, +}, where Q represents the square root function, and the set of terminals (variables) is T = {a, b}. In this case, n = 2; and if we choose h = 6, then t = 7. Thus, the length of the gene is 6+7=13 and the length of a chromosome composed of three genes is 3*13=39.
Fig. 2 illustrates an instance of such a chromosome and its expression, encoding sub-ETs linked by addition. It is worth noticing that each gene has the potential to code for ETs of different sizes and shapes. In gene 1, for example, the ET ends at position 2, whereas the gene ends at position 13.

4.3. Genetic Operators

The elements of the next population are generated by means of four primary operators: selection and replication, mutation, transposition, and recombination (cross-over).
Selection and replication are heredity operators in the general sense. During selection, the roulette is spun as many times as there are individuals in the population, always maintaining the same population size. During replication, the genomes of the selected individuals are copied as many times as the outcome of the roulette.
Mutations can occur anywhere in the chromosome. However, the structural organization of chromosomes must remain intact. In the heads, any symbol can change into another (function or terminal); in the tails, terminals can only change into terminals.
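The head and tail constraint on mutation can be sketched in a few lines of Python. The gene string, the per-symbol mutation probability and the function name `mutate` are illustrative assumptions; the symbol sets are those of the example in Section 4.2. Commercial tools may define the mutation rate differently.

```python
import random

FUNCTIONS = "Q*/-+"  # function set of the Section 4.2 example
TERMINALS = "ab"     # terminal set of the Section 4.2 example


def mutate(gene, head_len, rate, rng):
    """Point mutation that keeps the head/tail organization intact.

    Head positions may receive any symbol; tail positions only terminals.
    `rate` is treated as a per-symbol probability (an assumption).
    """
    mutated = []
    for position, symbol in enumerate(gene):
        if rng.random() < rate:
            pool = FUNCTIONS + TERMINALS if position < head_len else TERMINALS
            symbol = rng.choice(pool)
        mutated.append(symbol)
    return "".join(mutated)


# Hypothetical 13-symbol gene: head of 6 symbols, tail of 7 terminals.
gene = "Q*+-abbabaabb"
child = mutate(gene, head_len=6, rate=0.3, rng=random.Random(42))
```

Because the tail can never acquire a function, every mutated gene still decodes into a syntactically valid expression tree.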
Fig. 2. Expression of multi-genic chromosome encoding sub-ETs linked by a two-argument function. a) A three-genic chromosome with the tails shown in bold. b) The sub-ETs codified by each gene. c) The result of posttranslational linking with addition (phenotype of the chromosome). The linking functions are shown in gray. d) The multi-subunit ET encoded as genotype of the chromosome.
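The breadth-first reading of a gene described in Section 4.2 can be sketched in Python as follows. The gene used here is a hypothetical example, not the chromosome of Fig. 2, and the helper names are assumptions; protected operators (for instance, for the square root of a negative number or division by zero) are omitted for brevity.

```python
from math import sqrt

ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}  # terminals have arity 0
BINARY = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
    "/": lambda x, y: x / y,
}


def tail_length(head_len, max_arity):
    """Eq. (5): t = h(n - 1) + 1."""
    return head_len * (max_arity - 1) + 1


def decode(gene):
    """Build the expression tree line by line, left to right.

    Returns (root, used), where `used` is the number of gene symbols that
    are actually expressed; the remaining symbols are non-coding.
    """
    nodes = [[symbol, []] for symbol in gene]
    level, used = [nodes[0]], 1
    while level:
        next_level = []
        for node in level:
            for _ in range(ARITY.get(node[0], 0)):
                child = nodes[used]
                used += 1
                node[1].append(child)
                next_level.append(child)
        level = next_level
    return nodes[0], used


def evaluate(node, env):
    """Evaluate an expression tree for the terminal values in `env`."""
    symbol, children = node
    if symbol in env:
        return env[symbol]
    args = [evaluate(child, env) for child in children]
    if symbol == "Q":
        return sqrt(args[0])
    return BINARY[symbol](args[0], args[1])


# Hypothetical gene with h = 6 and t = 7; it encodes sqrt((a + b) * (b - a)).
gene = "Q*+-abbabaabb"
root, used = decode(gene)
value = evaluate(root, {"a": 1.0, "b": 3.0})
```

For this gene only the first 8 of the 13 symbols are expressed, which illustrates how genes of fixed length can code for trees of different sizes.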
The transposable elements of GEP are fragments of the genome that can be activated and jump to another place in the chromosome. In GEP there are three kinds of transposable elements: i) short fragments with a function or terminal in the first position that transpose to the head of genes except the root (insertion sequence elements or IS elements); ii) short fragments with a function in the first position that transpose to the root of genes (root IS elements or RIS elements); and iii) entire genes that transpose to the beginning of chromosomes.
In GEP there are three kinds of recombination: one-point recombination, two-point recombination and gene recombination. In all types of recombination, two chromosomes are randomly chosen and paired to exchange some material between them, creating two new daughter chromosomes. Usually the daughter chromosomes are as different from each other as they are from their parents (Ferreira, 2001).

4.4. Numerical constants

Numerical constants are an integral part of most mathematical models. In GEP, there are two different approaches to the problem of constant creation: one without using ephemeral random constants, and another using ephemeral random constants (Ferreira, 2001). In the first approach, a special facility to handle numerical constants was implemented. In the second approach, an additional domain Dc was created. Structurally, the Dc comes after the tail, has a length equal to t, and consists of the symbols used to represent the ephemeral random constants.
For each gene, the random numerical constants (RNCs) are generated during the inception of the initial population and kept in an array. The values of each random constant are only assigned during gene expression. Furthermore, a special operator is used to introduce genetic variation in the available pool of random constants by mutating the random constants directly. In addition, the usual GEP operators (mutation, inversion, transposition, and recombination) plus a Dc-specific inversion and a Dc-specific transposition guarantee the effective circulation of the random constants in the population (Ferreira, 2001).

4.5. Basic GEP Algorithm

A basic representation of the GEP algorithm is shown in Fig. 3 (Ferreira). The process begins with the random generation of the initial population. Then these chromosomes are expressed and the fitness of each individual is evaluated against a set of fitness cases. The individuals are then selected according to their fitness to reproduce with modification, leaving progeny with new traits. These new individuals are, in their turn, subjected to the same developmental process. The process is repeated for a certain number of generations or until a good solution has been found.

5. The developed model

The commercial software GeneXproTools v5.02 was used in this study (www.gepsoft.com). To develop the best model, many GEP models were run using the training and testing data. The GEP parameters for these models are given in Table 2.
Determining the number of genes is one of the most important issues, affecting the accuracy and application of the model (a decrease or increase in the number of genes can reduce the accuracy or lengthen the resulting equation) (Behnia et al., 2013). According to the problem conditions, a four-gene GEP-RNC algorithm was used.
The program was run until the maximum fitness was reached or the runs automatically terminated. Then the best model was chosen on the basis of a multi-objective strategy as follows (Musavi, 2012):
i. Selecting the simplest model, although this is not a predominant factor.
ii. Providing the best fitness value on the training data.
iii. Providing the best fitness value on the validation data.
The first objective can be controlled by the user through the parameter settings (e.g., head size or number of genes). For the other objectives, the Root Relative Squared Error (RRSE) fitness function is used.
Mathematically, the root relative squared error of an individual program i is evaluated by the equation:

$$RRSE_i = \sqrt{\frac{\sum_{j=1}^{n} (P_{ij} - T_j)^2}{\sum_{j=1}^{n} (T_j - \bar{T})^2}} \tag{6}$$

where $P_{ij}$ is the value predicted by the individual program i for fitness case j (out of n fitness cases or sample cases); $T_j$ is the target value for fitness case j; and $\bar{T}$ is given by the formula:

$$\bar{T} = \frac{1}{n} \sum_{j=1}^{n} T_j \tag{7}$$

The RRSE index ranges from 0 to infinity, with 0 corresponding to the ideal fit. As it stands, $RRSE_i$ cannot be used directly as fitness since, for fitness proportionate selection, the value of fitness must increase with efficiency. Thus, for evaluating the fitness $f_i$ of an individual program i, the following equation is used:

$$f_i = 1000 \cdot \frac{1}{1 + RRSE_i} \tag{8}$$

which ranges from 0 to 1000, with 1000 corresponding to the ideal (Ferreira, 2006).
Fig. 3. The flowchart of Gene Expression Programming
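The RRSE-based fitness of Eqs. (6) to (8) can be written as a short Python sketch. The function names are assumptions, and the last helper simply inverts Eq. (8); applied to the fitness values of 777.91 and 782.28 reported for the selected model in Section 5, it gives RRSE values of roughly 0.286 and 0.278.

```python
from math import sqrt


def rrse(predicted, target):
    """Eq. (6): root relative squared error against the mean predictor."""
    mean_target = sum(target) / len(target)  # Eq. (7)
    numerator = sum((p - t) ** 2 for p, t in zip(predicted, target))
    denominator = sum((t - mean_target) ** 2 for t in target)
    return sqrt(numerator / denominator)


def fitness(predicted, target):
    """Eq. (8): fitness in the range 0 to 1000, where 1000 is a perfect fit."""
    return 1000.0 / (1.0 + rrse(predicted, target))


def rrse_from_fitness(f):
    """Inverse of Eq. (8), useful for reading reported fitness values."""
    return 1000.0 / f - 1.0


# Hypothetical targets: a perfect model scores 1000, while a model that
# always predicts the mean of the targets has RRSE = 1 and scores 500.
targets = [3.2, 3.9, 4.4, 4.1]
mean_model = [sum(targets) / len(targets)] * len(targets)
perfect_score = fitness(targets, targets)
mean_score = fitness(mean_model, targets)
```

An RRSE below 1 therefore means that the model predicts better than the simple mean of the measured values.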
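Table 2.
GEP parameters used for developed models.
Group | Parameter | Value
Evolution structure | Number of runs | 100
Evolution structure | Number of generations | 4000
Evolution structure | Population size | 300
Evolution structure | Number of fitness cases | 56
Gene components | Terminal set | Fe, Fe/FeO, MD, SiO2, CaO, LOI
Gene components | Function set | +, -, *, /, exp, x^2, x^3, , , sin, cos, arctan, tanh
Gene structure | Head length | 8
Gene structure | Tail length | 9
Gene structure | Dc length | 9
Gene structure | Gene length | 26
Chromosome structure | Number of genes | 4
Chromosome structure | Linking function | Addition
Operator setting | Mutation rate | 0.0294
Operator setting | Inversion rate | 0.1
Operator setting | IS transposition rate | 0.1
Operator setting | RIS transposition rate | 0.1
Operator setting | Gene transposition rate | 0.1
Operator setting | One-point recombination rate | 0.3
Operator setting | Two-point recombination rate | 0.2
Operator setting | Gene recombination rate | 0.1
Operator setting | RNC mutation | 0.055
Operator setting | Dc mutation | 0.1
Operator setting | Dc inversion | 0.1
Operator setting | Dc IS transposition | 0.1
Numerical constants | Constants per gene | 10
Numerical constants | Data type | Floating-point
Numerical constants | Lower bound | -10
Numerical constants | Upper bound | 10
Fitness function | Function | Root Relative Squared Error (RRSE)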
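The gene-structure entries of Table 2 can be cross-checked against Eq. (5) with a few lines of arithmetic. The sketch assumes that the largest arity in the function set is 2 (the binary operators) and that the Dc domain has the same length as the tail, as stated in Section 4.4; the variable names are illustrative.

```python
HEAD_LENGTH = 8
MAX_ARITY = 2        # assumption: binary operators are the widest functions
NUMBER_OF_GENES = 4
SAMPLES_IN_DOMAIN_RD = 70
TRAIN_FRACTION = 0.8

tail_length = HEAD_LENGTH * (MAX_ARITY - 1) + 1        # Eq. (5)
dc_length = tail_length                                # Dc has length t
gene_length = HEAD_LENGTH + tail_length + dc_length
chromosome_length = NUMBER_OF_GENES * gene_length
fitness_cases = round(TRAIN_FRACTION * SAMPLES_IN_DOMAIN_RD)
```

These values agree with the tail length (9), Dc length (9), gene length (26) and number of fitness cases (56) listed in the table, and imply a chromosome of 104 symbols.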
After many runs, the best individual was found. This chromosome was observed in the 3,561st generation of the 18th run, with fitness values of 777.91 and 782.28 for training and test data, respectively. Fig. 4 shows the expression trees of the achieved individual, which can be mapped into mathematical equations and then written in an appropriate programming language. The derived code in the C# language, corresponding to the figure, is presented in Fig. 5.

6. Results and discussion

6.1. The Model predictability

To evaluate the accuracy of the GEP model, the testing data sets were applied, and the mean absolute error (MAE) and correlation coefficient (R) were then calculated using the following equations:

$$R = \frac{\sum_{i=1}^{n} (A_i - \bar{A})(C_i - \bar{C})}{\sqrt{\left[\sum_{i=1}^{n} (A_i - \bar{A})^2\right]\left[\sum_{i=1}^{n} (C_i - \bar{C})^2\right]}} \tag{9}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |A_i - C_i| \tag{10}$$

where $A_i$ and $C_i$ are respectively the actual and calculated outputs for the output i, $\bar{A}$ and $\bar{C}$ are respectively the averages of the actual and calculated outputs, and n is the number of samples or fitness cases.
For the selected optimum model, MAE and R were equal to 0.088 and 0.970, respectively. Also, the comparison between actual and calculated values of SGn is graphically illustrated in Fig. 6.
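The two accuracy measures of Eqs. (9) and (10) can be computed with the following minimal Python sketch. The function names and the sample values are illustrative assumptions and are not data from the Sangan test set.

```python
from math import sqrt


def mean_absolute_error(actual, calculated):
    """Eq. (10): average absolute difference between actual and calculated."""
    return sum(abs(a - c) for a, c in zip(actual, calculated)) / len(actual)


def correlation_coefficient(actual, calculated):
    """Eq. (9): Pearson correlation between actual and calculated outputs."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_c = sum(calculated) / n
    covariance = sum((a - mean_a) * (c - mean_c) for a, c in zip(actual, calculated))
    spread_a = sum((a - mean_a) ** 2 for a in actual)
    spread_c = sum((c - mean_c) ** 2 for c in calculated)
    return covariance / sqrt(spread_a * spread_c)


# Hypothetical measured and predicted specific gravities.
measured = [3.20, 3.85, 4.10, 4.45]
predicted = [3.30, 3.80, 4.00, 4.50]
mae = mean_absolute_error(measured, predicted)
r = correlation_coefficient(measured, predicted)
```

Note that R measures only linear association: a model with a constant bias can still reach R = 1, which is why MAE is reported alongside it.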
Fig. 4. Phenotype Structure of the optimal GEP model.

Fig. 5. Derived Code of the optimal GEP model in C# language.
6.2. Sensitivity analysis

The contribution of each input variable to the prediction of SGn is evaluated through a sensitivity analysis. For this aim, the frequency with which each input variable appeared in the best thirty programs evolved by GEP was determined. A frequency value equal to 1.00 for a variable indicates that this variable has contributed to 100% of the best thirty programs. This methodology is a common approach in GP-based analyses (Gandomi, 2011). The frequency values of the input variables are presented in Fig. 7.
According to the figure, the input parameters Fe %, SiO2 % and CaO % are the most effective factors on the rock density, whereas the Fe/FeO ratio is the least effective parameter in this regard. The results give a good indication that the Fe/FeO ratios of the magnetite ore have been changed naturally and therefore do not predominate within the density distribution.

7. Conclusions and future work

A hierarchical approach with a soft viewpoint in modeling was developed for the prediction of density in rock mass, and the first step of this approach was discussed and implemented in the Sangan iron ore deposit. Step I focuses on mathematical modeling of natural density on an intact specimen scale. The database was divided into three geological domains, and the domain containing the reduced ore in carbonates (RD Domain) was considered for the study. Gene expression programming (GEP) was proposed as an appropriate soft computing tool. A four-gene structure of GEP with random numerical constants was found to be optimum using a trial-and-error mechanism. Finally, the contribution of the predictor parameters was investigated using sensitivity analysis. It was observed that Fe% is the most influencing and the Fe/FeO ratio is the least influencing parameter in the RD Domain.
[Figure 6: scatter plots of predicted SGn versus measured SGn (both axes from 2.5 to 5) with a linear fit. Panel (a): MAE = 0.076, R = 0.959. Panel (b): MAE = 0.088, R = 0.970.]
Fig. 6. Measured versus predicted natural specific gravity using the optimal GEP model: (a) training and (b) validation data.
[Figure 7: bar chart of the frequency (0 to 1) of each input variable, with the following values.]

Input variable    Frequency
Fe %              1
Fe/FeO            0.6
Mean Depth        0.866
LOI %             0.833
SiO2 %            0.966
CaO %             0.9

Fig. 7. Contributions of the input variables in the GEP modeling.
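The frequency measure behind Fig. 7 can be illustrated with a short Python sketch. It assumes that each evolved program can be summarized by the set of input variables it uses. The function name and the four toy programs are hypothetical, since the thirty programs themselves are not listed in the paper. Only the `reported` dictionary repeats the values given in Fig. 7.

```python
def variable_frequencies(programs, variables):
    """Fraction of programs in which each variable appears at least once."""
    if not programs:
        raise ValueError("at least one program is required")
    return {
        v: sum(1 for used in programs if v in used) / len(programs)
        for v in variables
    }


# Hypothetical toy programs (sets of variables used); not the paper's programs.
variables = ["Fe", "Fe/FeO", "MeanDepth", "LOI", "SiO2", "CaO"]
toy_programs = [
    {"Fe", "SiO2", "CaO"},
    {"Fe", "LOI", "SiO2"},
    {"Fe", "Fe/FeO", "CaO", "MeanDepth"},
    {"Fe", "SiO2", "CaO", "LOI"},
]
toy_freq = variable_frequencies(toy_programs, variables)

# Frequency values as reported in Fig. 7, ranked from most to least frequent.
reported = {
    "Fe %": 1.0,
    "Fe/FeO": 0.6,
    "Mean Depth": 0.866,
    "LOI %": 0.833,
    "SiO2 %": 0.966,
    "CaO %": 0.9,
}
ranking = sorted(reported, key=reported.get, reverse=True)
print(ranking)
```

Ranking the reported values puts Fe %, SiO2 % and CaO % first and the Fe/FeO ratio last, which matches the ordering discussed in the sensitivity analysis.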
Now the mathematical model should be used to predict the value of natural density in other specimens and to analyze the variance relationships on the region scale. This is the subject of Step II, which will be discussed in the next paper.

References

Behnia, D., Ahangari, K., Noorzad, A., Meinossadadat, S.R., 2013. Predicting crest settlement in CFRDs using ANFIS and GEP intelligent methods. Journal of Zhejiang University-Science A (Applied Physics & Engineering) 14(8), 589-602.
Ferreira, C., 2001. Gene Expression Programming: A New Adaptive Algorithm for Solving Problems. Journal of Complex Systems 13(2), 87-129.
Ferreira, C., 2006. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence. 2nd ed. Germany: Springer-Verlag.
Gandomi, A.H., Alavi, A.H., Mirzahosseini, M.R., Moqhadas Nejad, F., 2011. Nonlinear genetic-based models for prediction of flow number of asphalt mixtures. J Mater Civil Eng ASCE 23, 248-63.
Gepsoft. Modeling made easy: data mining software. <www.gepsoft.com> (accessed 15.12.14).
Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. New York: Addison-Wesley.
Holland, J.H., 1992. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology. 2nd ed. MIT Press.
Hustrulid, W.A., Kuchta, M., Martin, R.K., 2013. Open Pit Mine Planning and Design. 3rd ed. CRC Press/Balkema.
Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.
Moon, C.J., Whateley, M.K.G., Evans, M.A., 2006. Introduction to Mineral Exploration. 2nd ed. Australia: Blackwell Publishing.
Musavi, S.M., Aminian, P., Gandomi, A.S., Alavi, A.H., Bolandi, H., 2012. A new predictive model for compressive strength of HPC using gene expression programming. Advances in Engineering Software 45, 105-114.
Schon, J.H., 2011. Physical Properties of Rocks. Handbook of Petroleum Exploration and Production, Vol. 8. Elsevier B.V.
Sinclair, A.J., Blackwell, G.H., 2004. Applied Mineral Inventory Estimation. Cambridge University Press.
