Title: | Perform the Complete Processing of a Set of Proton Nuclear Magnetic Resonance Spectra |
---|---|
Description: | Perform the complete processing of a set of proton nuclear magnetic resonance spectra from the free induction decay (raw data) and based on a processing sequence (macro-command file). An additional file specifies all the spectra to be considered by associating their sample code as well as the levels of experimental factors to which they belong. More detail can be found in Jacob et al. (2017) <doi:10.1007/s11306-017-1178-y>. |
Authors: | Daniel Jacob [cre, aut] |
Maintainer: | Daniel Jacob <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.6.0 |
Built: | 2025-03-26 10:36:14 UTC |
Source: | https://github.com/inra/rnmr1d |
calibSpectrum
belongs to the low-level functions group - it processes only one raw spectrum at time.
calibSpectrum(spec, zoneref, ppmref)
calibSpectrum(spec, zoneref, ppmref)
spec |
a spectrum object returned by the readSpectrum function |
zoneref |
the ppm range containing the TSP/DSS signal |
ppmref |
the ppm reference value |
spec object
checkMacroCmdFile
Check if the macro-commands included in the input file (commandfile)
are compliant with the allowed commands.
checkMacroCmdFile(commandfile)
checkMacroCmdFile(commandfile)
commandfile |
The macro-commands file - the allowed commands are : 'align', 'warp', 'clupa', 'gbaseline', 'baseline', 'qnmrbline', 'airpls', 'binning', 'calibration', 'normalisation', 'denoising', 'bucket', 'zero'. |
return 1 if the macro-commands included in the input file are compliant, 0 if not.
the NMRProcFlow online documentation https://nmrprocflow.org/ and especially the Macro-command Reference Guide (https://nmrprocflow.org/themes/pdf/Macrocommand.pdf)
data_dir <- system.file("extra", package = "Rnmr1D") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") ret <- checkMacroCmdFile(CMDFILE)
data_dir <- system.file("extra", package = "Rnmr1D") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") ret <- checkMacroCmdFile(CMDFILE)
cleanPeaks
cleans the peaks under a specified threshold and also remove redundant peaks having the same position
cleanPeaks(spec, peaks, ratioPN, keeprows = FALSE)
cleanPeaks(spec, peaks, ratioPN, keeprows = FALSE)
spec |
a 'spec' object |
peaks |
a data.frame of the input peaks |
ratioPN |
Threshold of the Peak/Noise ratio below which the peaks will be rejected |
keeprows |
indicates if row names must be preserved. |
a data.frame of the remaining peaks
computeBL
computes baseline based on the model.
computeBL(spec, model)
computeBL(spec, model)
spec |
a 'spec' object |
model |
a model object |
a vector of the baseline estimated during the deconvolution process
Initialize the deconvolution parameter list
deconvParams
deconvParams
An object of class list
of length 45.
flist
: Filter type list : 'smooth1', 'smooth2' and'smooth3' for Savitzky-Golay filter, 'daub8' and 'symlet8' for filter based on wavelet
criterion
: Criterion type for the optimizations : 0 => R2, 1 => 1/Std(residues) - default value = 0
reltol
: Criterion tolerance for the optimization - default value = 0.0001
facN
: Noise factor applied while the peak finding step - default value = NULL
ratioPN
: Peak/Noise Ratio applied while the peak selection step - default value = 1
obl
: Optimization of a baseline (BL) for each massif. 0 means no BL, an integer greater than 0 indicates the polynomial order of the BL default value = 0
distPeaks
: PeakFinder : min distance between 2 peaks (as multiple of sigma_min which is typically equal to 0.0005 ppm) - default value = 2
optim
: Indicates if optimisation is applied - default value = 1
oppm
: Indicates if ppm optimisation is applied - default value = 1
osigma
: Indicates if sigma optimisation is applied - default value = 1
d2meth
: PeakFinder : Indicates if minima method to the second derivation is applied
spcv
: PeakFinder : Maximal CV on Spectrum - default value = 0.005
d2cv
: PeakFinder : Maximum CV on the derived spectrum - default value = 0.05
d1filt
: Apply Filter (1) on the 1st derivate or not (0) - default value = 0
d2filt
: Apply Filter (1) on the 2nd derivate or not (0) - default value = 0
sigma_min
: Optimization of Sigmas : Fixe the minimum limit of sigmas - default value = 0.0005
sigma_max
: Optimization of Sigmas : Fixe the maximum limit of sigmas - default value = 0.005
verbose
: Indicates if we want information messages - default value = 1
exclude_zones
: Exclude ppm zones for the criterion evaluation - default value = NULL
detectCores
is simply a shortcut for parallel::detectCores().
detectCores(...)
detectCores(...)
... |
See |
doProcCmd
it process the Macro-commands string array specified at input.
doProcCmd(specObj, cmdstr, ncpu = 1, debug = FALSE)
doProcCmd(specObj, cmdstr, ncpu = 1, debug = FALSE)
specObj |
a complex list return by |
cmdstr |
the Macro-commands string array; See the Macro-command Reference Guide (https://nmrprocflow.org/themes/pdf/Macrocommand.pdf) to have more details about macro-commands. |
ncpu |
The number of cores [default: 1] |
debug |
a boolean to specify if we want the function to be more verbose. |
specMat
: a 'specMat' object - See the manual page of the doProcessing
function for more details on its structure
data_dir <- system.file("extra", package = "Rnmr1D") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") SAMPLEFILE <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=CMDFILE, samplefile=SAMPLEFILE, ncpu=2) # Apply an intelligent bucketing (AIBIN) specMat.new <- Rnmr1D::doProcCmd(out, c("bucket aibin 10.2 10.5 0.3 3 0", "9.5 4.9", "4.8 0.5", "EOL" ),ncpu=2, debug=TRUE) out$specMat <- specMat.new
data_dir <- system.file("extra", package = "Rnmr1D") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") SAMPLEFILE <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=CMDFILE, samplefile=SAMPLEFILE, ncpu=2) # Apply an intelligent bucketing (AIBIN) specMat.new <- Rnmr1D::doProcCmd(out, c("bucket aibin 10.2 10.5 0.3 3 0", "9.5 4.9", "4.8 0.5", "EOL" ),ncpu=2, debug=TRUE) out$specMat <- specMat.new
doProcessing
is the main function of this package. Indeed, this function performs
the complete processing of a set of 1D NMR spectra from the FID (raw data) and based on a
processing sequence (macro-command file). An additional file specifies all the spectra to
be considered by associating their sample code as well as the levels of experimental
factors to which they belong. In this way it is possible to select only a subset of spectra
instead of the whole set.
doProcessing( path, cmdfile, samplefile = NULL, bucketfile = NULL, phcfile = NULL, ncpu = 1 )
doProcessing( path, cmdfile, samplefile = NULL, bucketfile = NULL, phcfile = NULL, ncpu = 1 )
path |
The full path of either the raw spectra directory on the disk |
cmdfile |
The full path name of the Macro-commands file for processing (text format) |
samplefile |
The full path name of the Sample file (tabular format) |
bucketfile |
The full path name of the file of bucket's zones (tabular format) |
phcfile |
The full path name of the phasing file for samples if required (tabular format) |
ncpu |
The number of cores [default: 1] |
doProcessing
returns a list containing the following components:
samples
: the samples matrix with the correspondence of the raw spectra,
as well as the levels of the experimental factors if specified in the input.
factors
: the factors matrix with the corresponding factor names.
At minimum, the list contains the Samplecode label corresponding to the samples without their
group level.
rawids
: list of the full directories of the raw spectra (i.e. where the FID files
are accessible)
infos
: list of the acquisition and processing parameters for each (raw) spectra.
specMat
: objects list regarding the spectra data.
int
: the matrix of the spectra data (nspec
rows X size
columns)
nspec
: the number of spectra
size
: the size (i.e number of points) of each spectra
ppm_min
, ppm_max
: the minimum and the maximum ppm values of
spectra
ppm
: the vector of the ppm values (size
values)
dppm
: the ppm increment between each point
buckets_zones
: the matrix of the buckets zones including two columns
(min and max)
namesASintMax
: boolean - If TRUE, generate all output matrix with
bucket names based on ppm values of the maximum of the average intensity of all spectra within
the ppm range of each bucket. If FALSE (default), then bucket names will be based on the ppm
range center of each bucket.
the NMRProcFlow online documentation https://nmrprocflow.org/ and especially the Macro-command Reference Guide (https://nmrprocflow.org/themes/pdf/Macrocommand.pdf)
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2)
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2)
estimateBL
estimates of the baseline of the spectrum in the corresponding ppm range (based on the C_Estime_LB routine)
estimateBL(spec, ppmrange, WS = 50, NEIGH = 35)
estimateBL(spec, ppmrange, WS = 50, NEIGH = 35)
spec |
a 'spec' object |
ppmrange |
the ppm range in which the baseline will be estimated |
WS |
Size of the window (in number of points) from which a rolling average will be established |
NEIGH |
The minimum window size (in number of points) in which the signal compared to its mean can be considered as belonging to the baseline. |
a vector of the estimated baseline
filterByThreshold
applies a filtering based on wavelet by specifying a threshold value
filterByThreshold(s, wavelet, type = 0)
filterByThreshold(s, wavelet, type = 0)
s |
the spectral signal as a numerical vector |
wavelet |
the name of the wavelet: haar, daub2, daub4, daub8, symlet2, symlet4, symlet8 |
type |
the type of the threshold : 0 for Soft threshold, 1 for Hard threshold |
a vector of the same dimension as the entry one
filterByWT
applies a filtering based on wavelet using the universal threshold
filterByWT(s, wavelet, threshold = 0.5)
filterByWT(s, wavelet, threshold = 0.5)
s |
the spectral signal as a numerical vector |
wavelet |
the name of the wavelet: haar, daub2, daub4, daub8, symlet2, symlet4, symlet8 |
threshold |
the threshold value - default value is 0.5 |
a vector of the same dimension as the entry one
filterSavGol
applies a Savitzky-Golay filter on a spectral signal.
filterSavGol(s, m, nl, nr)
filterSavGol(s, m, nl, nr)
s |
the spectral signal as a numerical vector |
m |
the degree of the polynomial filter (integer) |
nl |
width of the sliding window on the left (integer) |
nr |
width of the sliding window on the rigth (integer) |
a vector of the same dimension as the entry one
generateMetadata
Generate the metadata from the list of raw spectra namely the samples, the experimental factors and the list of selected raw spectra. Depending on whether the sample matrix is supplied as input or not,
generateMetadata(RAWDIR, procParams, samples = NULL)
generateMetadata(RAWDIR, procParams, samples = NULL)
RAWDIR |
The full path of either the raw spectra directory on the disk |
procParams |
the list of processing parameters. First initialize this list with the |
samples |
the samples matrix with the correspondence of the raw spectra |
generateMetadata
returns a list containing the following components:
samples
: the samples matrix with the correspondence of the raw spectra, as well as the levels of the experimental factors if specified in the input.
factors
: the factors matrix with the corresponding factor names. At minimum, the list contains the Samplecode label corresponding to the samples without their group level.
rawids
: list of the full directories of the raw spectra (i.e. where the FID files are accessible)
data_dir <- system.file("extra", package = "Rnmr1D") samplefile <- file.path(data_dir, "Samples.txt") samples <- read.table(samplefile, sep="\t", header=TRUE,stringsAsFactors=FALSE) metadata <- generateMetadata(data_dir, procParams=Spec1rProcpar, samples)
data_dir <- system.file("extra", package = "Rnmr1D") samplefile <- file.path(data_dir, "Samples.txt") samples <- read.table(samplefile, sep="\t", header=TRUE,stringsAsFactors=FALSE) metadata <- generateMetadata(data_dir, procParams=Spec1rProcpar, samples)
Generates the matrix including the integrations of the areas defined by the buckets (columns) on each spectrum (rows)
getBucketsDataset(specObj, norm_meth = "none", zoneref = NA)
getBucketsDataset(specObj, norm_meth = "none", zoneref = NA)
specObj |
a complex list return by |
norm_meth |
Normalization method. The possible values are : 'none', 'CSN' or 'PDN'. See below. |
zoneref |
Specify the ppm zone of the internal reference (i.e. ERETIC) if applicable. default is NA. |
Before bucket data export in order to make all spectra comparable with each other, the variations of the overall concentrations of samples have to be taken into account. We propose two normalization methods. In NMR metabolomics, the total intensity normalization (called the Constant Sum Normalization) is often used so that all spectra correspond to the same overall concentration. It simply consists in normalizing the total intensity of each individual spectrum to a same value. An other method called Probabilistic Quotient Normalization (Dieterle et al. 2006) assumes that biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolites signals. Probabilistic Quotient Normalization (PQN) starts by the calculation of a reference spectrum based on the median spectrum. Next, for each variable of interest the quotient of a given test spectrum and reference spectrum is calculated and the median of all quotients is estimated. Finally, all variables of the test spectrum are divided by the median quotient. An internal reference can be used to normalize the data. For example, an electronic reference (ERETIC, see Akoka et al. 1999, or ERETIC2 generated with TopSpin software) can be used for this purpose. The integral value of each bucket will be divided by the integral value of the ppm range given as reference.
the data matrix
Akoka S1, Barantin L, Trierweiler M. (1999) Concentration Measurement by Proton NMR Using the ERETIC Method, Anal. Chem 71(13):2554-7. doi: 10.1021/ac981422i.
Dieterle F., Ross A., Schlotterbeck G. and Senn H. (2006). Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Application in 1H NMR Metabonomics. Analytical Chemistry, 78:4281-4290.doi: 10.1021/ac051632c
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) outMat <- getBucketsDataset(out, norm_meth='CSN')
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) outMat <- getBucketsDataset(out, norm_meth='CSN')
Generates the buckets table
getBucketsTable(specObj)
getBucketsTable(specObj)
specObj |
a complex list return by |
the buckets table
From the data matrix generated from the integration of all bucket zones (columns) for each spectrum (rows), we can take advantage of the concentration variability of each compound in a series of samples by performing a clustering based on significant correlations that link these buckets together into clusters. Bucket Clustering based on either a lower threshold applied on correlations or a cutting value applied on a hierarchical tree of the variables (buckets) generated by an Hierarchical Clustering Analysis (HCA).
getClusters(data, method = "hca", ...)
getClusters(data, method = "hca", ...)
data |
the matrix including the integrations of the areas defined by the buckets (columns) on each spectrum (rows) |
method |
Clustering method of the buckets. Either 'corr' for 'correlation' or 'hca' for 'hierarchical clustering analysis'. |
... |
Depending on the chosen method:
|
At the bucketing step (see above), we have chosen the intelligent bucketing, it means that each bucket exact matches with one resonance peak. Thanks to this, the buckets now have a strong chemical meaning, since the resonance peaks are the fingerprints of chemical compounds. However, to assign a chemical compound, several resonance peaks are generally required in 1D 1 H-NMR metabolic profiling. To generate relevant clusters (i.e. clusters possibly matching to chemical compounds), two approaches have been implemented:
Bucket Clustering based on a lower threshold applied on correlations
In this approach an appropriate correlation threshold is applied on the correlation matrix before its cluster decomposition. Moreover, an improvement can be done by searching for a trade-off on a tolerance interval of the correlation threshold : from a fixed threshold of the correlation (cval), the clustering is calculated for the three values (cval-dC, cval, cval+dC), where dC is the tolerance interval of the correlation threshold. From these three sets of clusters, we establish a merger according to the following rules: 1) if a large cluster is broken, we keep the two resulting clusters. 2) If a small cluster disappears, the initial cluster is conserved. Generally, an interval of the correlation threshold included between 0.002 and 0.01 gives good trade-off.
Bucket Clustering based on a hierarchical tree of the variables (buckets) generated by an Hierarchical Clustering Analysis (HCA)
In this approach a Hierachical Classification Analysis (HCA, hclust
)
is applied on the data after calculating a matrix distance ("euclidian" by default). Then, a cut
is applied on the tree (cutree
) resulting from hclust
,
into several groups by specifying the cut height(s). For finding best cut value, the cut height
is chosen i) by testing several values equally spaced in a given range of the cut height, then,
2) by keeping the one that gives the more cluster and by including most bucket variables.
Otherwise, a cut value has to be specified by the user (vcutusr)
getClusters
returns a list containing the following components:
vstats
Statistics that served to find the best value of the criterion (matrix)
clusters
List of the ppm value corresponding to each cluster. the length of the list equal to number of clusters
clustertab
the associations matrix that gives for each cluster (column 2) the corresponding buckets (column 1)
params
List of parameters related to the chosen method for which the clustering was performed.
vcrit
Value of the (best/user) criterion, i.e correlation threshold for 'corr' method or the cut value for the 'hca' method.
indxopt
Index value within the vstats matrix corresponding to the criterion value (vcrit)
Jacob D., Deborde C. and Moing A. (2013) An efficient spectra processing method for metabolite identification from 1H-NMR metabolomics data. Analytical and Bioanalytical Chemistry 405(15) 5049-5061 doi: 10.1007/s00216-013-6852-y
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) outMat <- getBucketsDataset(out, norm_meth='CSN') clustcorr <- getClusters(outMat, method='corr', cval=0, dC=0.003, ncpu=2) clusthca <- getClusters(outMat, method='hca', vcutusr=0)
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) outMat <- getBucketsDataset(out, norm_meth='CSN') clustcorr <- getClusters(outMat, method='corr', cval=0, dC=0.003, ncpu=2) clusthca <- getClusters(outMat, method='hca', vcutusr=0)
getDeconvParams
merges some specific parameters values with the full deconvolution list and return the resulting list. With no parameter as input, it returns the default parameter list.
getDeconvParams(params = NULL)
getDeconvParams(params = NULL)
params |
a list defining some specific parameters for deconvolution |
the resulting list of deconvolution parameters.
merged variables for each cluster (based on their average)
getMergedDataset(data, clustObj, onlycluster = FALSE)
getMergedDataset(data, clustObj, onlycluster = FALSE)
data |
the matrix including the integrations of the areas defined by the buckets (columns) on each spectrum (rows) |
clustObj |
a list generated by the |
onlycluster |
boolean - specifies if the merged data matrix at output must only contain the merged clusters (TRUE) or if it must also contain the buckets that are not include within a cluster (FALSE) |
Slice the spectrum in order to define ranges for Local Deconvolution (LSDeconv) and return only those include the provided ppmranges
getSlices(spec, ppmranges, flatwidth = 0.004, snrfactor = 4, maxrange = 0.3)
getSlices(spec, ppmranges, flatwidth = 0.004, snrfactor = 4, maxrange = 0.3)
spec |
a 'spec' object |
ppmranges |
ppm ranges as a matrix in order to apply the deconvolution, each row specifying a zone |
flatwidth |
specifies the minimum width of a zone in which the spectrum intensities are close to zero to consider this one as a cutting zone (default 0.003 ppm) |
snrfactor |
specifies factor applied on the Std Dev. of the noise used as threshold for first derivate intensities (default=4) |
maxrange |
specifies the maximum width of a cutting zone (default 0.3 ppm) |
a list of ppm range
Generates the Signal-Noise-Ratio dataset
getSnrDataset(specObj, zone_noise = c(10.2, 10.5), ratio = TRUE)
getSnrDataset(specObj, zone_noise = c(10.2, 10.5), ratio = TRUE)
specObj |
a complex list return by |
zone_noise |
Specify a ppm range of noisy zone default is c(10.2,10.5) |
ratio |
boolean; TRUE for output Signal-Noise Ratio, or FALSE to output maximum value of each bucket and in addition, the estimate noise as a separate column |
whatever the bucketing approach used, the Signal-to-Noise ratio is a good quality indicator. Thus, it is possible to check buckets based on their Signal-to-Noise ratio.
the Signal-Noise-Ratio matrix
Generates the spectral data matrix. The first column indicates the value of ppm, then the following columns correspond to spectral data, one column per spectrum.
getSpectraData(specObj)
getSpectraData(specObj)
specObj |
a complex list return by |
the spectral data matrix
Plots the boxplot of all clusters allowing to have an insight on the clusters distribution. Plot based on ggplot2
ggplotClusters(data, clustObj)
ggplotClusters(data, clustObj)
data |
the matrix including the integrations of the areas defined by the buckets (columns) on each spectrum (rows) |
clustObj |
a list generated by the |
Plots the curves that show the number of clusters, the number of clustered buckets and the size of biggest cluster versus the criterion, namely the correlation threshold for the 'corr' method, the cutting value for the 'hca' method.
ggplotCriterion(clustObj, reverse = FALSE)
ggplotCriterion(clustObj, reverse = FALSE)
clustObj |
a list generated by the |
reverse |
indicates if the x axis need to be reversed |
Plots the two components defined by pc1, pc2 of the matrix of variable loadings coming from a multivariable analysis, typically a Principal Component Analysis (PCA). It can also plot the ellipses corresponding to each cluster defined by the associations matrix if not null. (in fact it is the main interest of this function).
ggplotLoadings( data, pc1 = 1, pc2 = 2, EV = NULL, associations = NULL, main = "Loadings", onlylabels = FALSE, highlabels = FALSE, gcontour = "ellipse" )
ggplotLoadings( data, pc1 = 1, pc2 = 2, EV = NULL, associations = NULL, main = "Loadings", onlylabels = FALSE, highlabels = FALSE, gcontour = "ellipse" )
data |
the matrix of variable loadings coming from a multivariable analysis, typically a Principal Component Analysis (PCA) |
pc1 |
the fist component of the matrix of variable loadings to be plotted. |
pc2 |
the second component of the matrix of variable loadings to be plotted. |
EV |
Eigenvalues vector |
associations |
the associations matrix that gives for each cluster (column 2) the corresponding buckets (column 1). See |
main |
Change the default plot title on the rigth corner |
onlylabels |
if TRUE, put only the association names without drawing the cluster contours. Implies that association matrix is provided. |
highlabels |
if TRUE, put the the association names in blue, and others in grey. Implies that association matrix is provided and fONLYLABELS equal to TRUE. |
gcontour |
type of contour; possible values are : 'ellipse', 'polygon', 'ellipse2', 'none' |
Translate 'ggplot2' graphs to an interactive plotly version
ggplotPlotly(g, width = NULL, height = NULL, textposition = "right")
ggplotPlotly(g, width = NULL, height = NULL, textposition = "right")
g |
The ggplot2 graph object to be translated into an interactive plotly version |
width |
Width of the plot in pixels (optional, defaults to automatic sizing). |
height |
Height of the plot in pixels (optional, defaults to automatic sizing) |
textposition |
Position of the labels on the graphs relative to the points. Possible values are : 'right', 'left', 'top' or 'buttom' |
Plots the two components defined by pc1, pc2 of the matrix of scores coming from a multivariable analysis, typically a Principal Component Analysis (PCA).
ggplotScores( data, pc1 = 1, pc2 = 2, groups = NULL, EV = NULL, main = "Scores", glabels = FALSE, psize = 3, gcontour = "ellipse", params = list(cellipse = 0.95), colors = NULL )
ggplotScores( data, pc1 = 1, pc2 = 2, groups = NULL, EV = NULL, main = "Scores", glabels = FALSE, psize = 3, gcontour = "ellipse", params = list(cellipse = 0.95), colors = NULL )
data |
the matrix of scores coming from a multivariable analysis, typically a Principal Component Analysis (PCA) |
pc1 |
the fist component of the matrix of variable loadings to be plotted. |
pc2 |
the second component of the matrix of variable loadings to be plotted. |
groups |
the vector defining the factorial groups (same dimension as data rows) |
EV |
Eigenvalues vector |
main |
the plot main title |
glabels |
boolean indicating if labels have to be plotted |
psize |
point size |
gcontour |
type of contour; possible values are : 'ellipse', 'polygon', 'ellipse2', 'none' |
params |
parameters depending on the contour type |
Global Spectra Deconvolution: GSDeconv
belongs to the low-level functions group for deconvolution.
GSDeconv( spec, ppmrange, params = NULL, filter = "symlet8", scset = c(2, 3, 12), verbose = 1 )
GSDeconv( spec, ppmrange, params = NULL, filter = "symlet8", scset = c(2, 3, 12), verbose = 1 )
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
params |
a list of specific parameters for deconvolution |
filter |
a filter type for filtering the noise and smoothing the signal |
scset |
a set of scmin values |
verbose |
level of debug information |
a model object
Lorentz
belongs to the low-level functions group for deconvolution.
Lorentz(ppm, amp, x0, sigma, asym)
Lorentz(ppm, amp, x0, sigma, asym)
ppm |
a vector of ppm values |
amp |
amplitude of the lorentzian |
x0 |
central value of the lorentzian |
sigma |
half-width of the lorentzian |
asym |
asymetric parameter |
a vector of the lorentzian values (same size as ppm)
Local Spectra Deconvolution: LSDeconv
belongs to the low-level functions group for deconvolution.
LSDeconv( spec, ppmrange, params = NULL, filterset = c("daub8"), oblset = 0, verbose = 1 )
LSDeconv( spec, ppmrange, params = NULL, filterset = c("daub8"), oblset = 0, verbose = 1 )
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
params |
a list of specific parameters for deconvolution including or not (i.e equal to NULL) the matrix defining peaks, one peak by row, with columns defined as : pos, ppm, amp, sigma, eta |
filterset |
a set of filter type for filtering the noise and smoothing the signal (only if the matrix defining peaks not defined in order to find peaks) |
oblset |
a set of baseline order for fitting |
verbose |
level of debug information |
a model object
Multiple Local Spectra Deconvolution: MultiLSDeconv
belongs to the low-level functions group for deconvolution.
MultiLSDeconv( spec, ppmranges = NULL, params = NULL, filterset = c(7, 9), oblset = 0, ncpu = 4, verbose = 0 )
MultiLSDeconv( spec, ppmranges = NULL, params = NULL, filterset = c(7, 9), oblset = 0, ncpu = 4, verbose = 0 )
spec |
a 'spec' object |
ppmranges |
ppm ranges as a matrix in order to apply the deconvolution, each row specifying a zone |
params |
a list of specific parameters for deconvolution |
filterset |
a set of filter type for filtering the noise and smoothing the signal (only if the matrix defining peaks not defined in order to find peaks) |
oblset |
a set of baseline order for fitting |
ncpu |
number of CPU for parallel computing |
verbose |
level of debug information |
a model object
optimOneVoigt
belongs to the low-level functions group for deconvolution.
optimOneVoigt(X, Y, par)
optimOneVoigt(X, Y, par)
X |
a vector of ppm values |
Y |
a vector of intensities |
par |
a vector of the 3 pseudo-voigt parameters namely: Amplitude, central ppm value, 2 ppm widths at mid-height for mixed lorentizian and gaussian |
a vector of the pseudo-voigt parameters (same size as par)
peakFiltering
belongs to the low-level functions group for deconvolution.
peakFiltering(spec, peaks, ratioPN)
peakFiltering(spec, peaks, ratioPN)
spec |
a 'spec' object |
peaks |
the matrix defining peaks, one peak by row, with columns defined as : pos, ppm, amp, sigma, eta |
ratioPN |
the ratio Peaks/Noise for filtering |
a dataframe
peakFinder
belongs to the low-level functions group for deconvolution.
peakFinder(spec, ppmrange, params = NULL, filter = "none", verbose = 1)
peakFinder(spec, ppmrange, params = NULL, filter = "none", verbose = 1)
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
params |
a list of specific parameters for deconvolution |
filter |
a filter type for filtering the noise and smoothing the signal |
verbose |
level of debug information |
a list
peakOptimize
belongs to the low-level functions group for deconvolution.
peakOptimize(spec, ppmrange, params, verbose = 1)
peakOptimize(spec, ppmrange, params, verbose = 1)
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
params |
a list of specific parameters for deconvolution, including the matrix defining peaks, one peak by row, with columns defined as : pos, ppm, amp, sigma, eta |
verbose |
level of debug information |
a list
Plots the boxplot of all clusters allowing to have an insight on the clusters distribution
plotClusters( data, clustObj, horiz = TRUE, main = "Boxplot by clusters (log10 transformed)" )
plotClusters( data, clustObj, horiz = TRUE, main = "Boxplot by clusters (log10 transformed)" )
data |
the matrix including the integrations of the areas defined by the buckets (columns) on each spectrum (rows) |
clustObj |
a list generated by the |
horiz |
Boolean - Indicates if the plot is horizontal (TRUE) or vertical (FALSE) |
main |
Main title of the plot |
Plots the curves that show the number of clusters, the number of clustered buckets and the size of biggest cluster versus the criterion, namely the correlation threshold for the 'corr' method, the cutting value for the 'hca' method.
plotCriterion(clustObj, reverse = FALSE)
plotCriterion(clustObj, reverse = FALSE)
clustObj |
a list generated by the |
reverse |
Boolean - indicates if x-axis must be reversed (TRUE) or nor (FALSE) |
Plots the two components defined by pc1, pc2 of the matrix of variable loadings coming from a multivariable analysis, typically a Principal Component Analysis (PCA). It can also plot the ellipses corresponding to each cluster defined by the associations matrix if not null. (in fact it is the main interest of this function).
plotLoadings( data, pc1, pc2, associations = NULL, main = "Loadings", xlimu = c(min(data[, pc1]), max(data[, pc1])), ylimu = c(min(data[, pc2]), max(data[, pc2])), cexlabel = 1, pch = 20, ellipse = TRUE, level = 0.8 )
plotLoadings( data, pc1, pc2, associations = NULL, main = "Loadings", xlimu = c(min(data[, pc1]), max(data[, pc1])), ylimu = c(min(data[, pc2]), max(data[, pc2])), cexlabel = 1, pch = 20, ellipse = TRUE, level = 0.8 )
data |
the matrix of variable loadings coming from a multivariable analysis, typically a Principal Component Analysis (PCA) |
pc1 |
the fist component of the matrix of variable loadings to be plotted. |
pc2 |
the second component of the matrix of variable loadings to be plotted. |
associations |
the associations matrix that gives for each cluster (column 2) the corresponding buckets (column 1) |
main |
Change the default plot title on the rigth corner |
xlimu |
gives the limit to be plotted of the first component |
ylimu |
gives the limit to be plotted of the second component |
cexlabel |
number indicating the amount by which plotting text and symbols should be scaled relative to the default. |
pch |
Plotting Symbols |
ellipse |
boolean - specifies if ellipses are plot or not for each cluster |
level |
confidence level for plotting the ellipses |
plotModel
plots the model along with the resulting voigt functions from deconvolution
plotModel( spec, model, exclude_zones = NULL, labels = c("ppm", "id"), groups = NULL, tags = FALSE, xlab = "", ylab = "", title = "", grp_colors = NULL )
plotModel( spec, model, exclude_zones = NULL, labels = c("ppm", "id"), groups = NULL, tags = FALSE, xlab = "", ylab = "", title = "", grp_colors = NULL )
spec |
a 'spec' object (see |
model |
a 'model' object (see |
exclude_zones |
a list of vector defining the excluded zones for lorentzian plots |
labels |
choose as legend labels either 'ppm' or 'id' |
tags |
boolean allowing you to put identification tags at the top of each peak |
title |
title of the graphic |
grp_colors |
specifies the colors for the first groups and/or peaks |
Plots the two components defined by pc1, pc2 of the matrix of scores coming from a multivariable analysis, typically a Principal Component Analysis (PCA).
plotScores( data, pc1, pc2, samples, factor = NULL, cexlabel = 1.2, level = 0.95, xlim = NULL, ylim = NULL, col = NULL )
plotScores( data, pc1, pc2, samples, factor = NULL, cexlabel = 1.2, level = 0.95, xlim = NULL, ylim = NULL, col = NULL )
data |
the matrix of scores coming from a multivariable analysis, typically a Principal Component Analysis (PCA) |
pc1 |
the fist component of the matrix of variable loadings to be plotted. |
pc2 |
the second component of the matrix of variable loadings to be plotted.
as well as the levels of the experimental factors if specified in the input.
See |
samples |
the samples matrix with the correspondence of the raw spectra, |
factor |
if not null, the name of one of the columns defining the factorial groups in the samples matrix at input |
cexlabel |
number indicating the amount by which plotting text and symbols should be scaled relative to the default. |
level |
confidence level for plotting the corresponding ellipse |
xlim |
gives the limit to be plotted of the first component |
ylim |
gives the limit to be plotted of the second component |
col |
colors vector for ellipses - automatically defined by default |
plotSpec
plots all signals defined by a matrix.
plotSpec( ppmrange, x, y, ynames = c("Origin", "Filtered", "Model"), ycolors = c("grey", "blue", "red", "green", "orange", "magenta", "cyan", "darkgreen", "darkorange"), ysel = NULL, xlab = "", ylab = "", title = "" )
plotSpec( ppmrange, x, y, ynames = c("Origin", "Filtered", "Model"), ycolors = c("grey", "blue", "red", "green", "orange", "magenta", "cyan", "darkgreen", "darkorange"), ysel = NULL, xlab = "", ylab = "", title = "" )
ppmrange |
a ppm range defining the window of the plotting |
x |
a vector defining the x-axis (abscissa) |
y |
a vector or a matrix defining the y-axes (ordinates), each signal as a column |
ynames |
a vector defining the y names (same order as the y matrix) |
ycolors |
a vector defining the y colors (same order as the y matrix) |
ysel |
a vector defining the visibility of each y element (same order as the y matrix) |
title |
title of the graphic |
plotSpecMat
Plot spectra set, overlaid or stacked; if stacked, plot with or
without a perspective effect.
plotSpecMat( specMat, ppm_lim = c(min(specMat$ppm), max(specMat$ppm)), K = 0.67, pY = 1, dppm_max = 0.2 * (max(ppm_lim) - min(ppm_lim)), asym = 1, beta = 0, cols = NULL )
plotSpecMat( specMat, ppm_lim = c(min(specMat$ppm), max(specMat$ppm)), K = 0.67, pY = 1, dppm_max = 0.2 * (max(ppm_lim) - min(ppm_lim)), asym = 1, beta = 0, cols = NULL )
specMat |
a 'specMat' object - Spectra matrix in specMat$int (rows = samples, columns = buckets) |
ppm_lim |
ppm range of the plot |
K |
Graphical height of the stack (0 .. 1),(default=0.67) |
pY |
Intensity limit factor (default 1) |
dppm_max |
Max ppm shift to have a perspective effect |
asym |
Correction of vertical parallax effect (-1 .. 1) -1 : parallelogram 0 : trapeze with maximum asymmetric 1 : symmetric trapeze |
beta |
Correction of horizontal parallax effect (0 .. 0.2) (defaut 0) |
cols |
Vector of colors (same size that the number of spectra, i.e dim(specmat)[1]) |
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) # Overlaid plot plotSpecMat(out$specMat, ppm_lim=c(0.5,9), K=0, pY=0.1) # Stacked plot with perspective effect plotSpecMat(out$specMat, ppm_lim=c(-0.1,9),K=0.33) # Stacked plot with perspective effect with maximum asymmetric plotSpecMat(out$specMat, ppm_lim=c(0.5,5), K=0.33, asym=0) cols <- c(rep("red",3), rep("blue",3)) # Stacked plot with colors accordings to group levels plotSpecMat(out$specMat, ppm_lim=c(0.5,5), K=0.67, dppm_max=0, cols=cols)
data_dir <- system.file("extra", package = "Rnmr1D") cmdfile <- file.path(data_dir, "NP_macro_cmd.txt") samplefile <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(data_dir, cmdfile=cmdfile, samplefile=samplefile, ncpu=2) # Overlaid plot plotSpecMat(out$specMat, ppm_lim=c(0.5,9), K=0, pY=0.1) # Stacked plot with perspective effect plotSpecMat(out$specMat, ppm_lim=c(-0.1,9),K=0.33) # Stacked plot with perspective effect with maximum asymmetric plotSpecMat(out$specMat, ppm_lim=c(0.5,5), K=0.33, asym=0) cols <- c(rep("red",3), rep("blue",3)) # Stacked plot with colors accordings to group levels plotSpecMat(out$specMat, ppm_lim=c(0.5,5), K=0.67, dppm_max=0, cols=cols)
pos2ppm
convert an index position to the corresponding ppm value
pos2ppm(spec, index)
pos2ppm(spec, index)
spec |
a 'spec' object |
index |
an index position |
the corresponding ppm value
ppm2pos
convert a ppm value to the corresponding index position
ppm2pos(spec, ppm)
ppm2pos(spec, ppm)
spec |
a 'spec' object |
ppm |
a ppm value |
the corresponding index position
PVoigt
belongs to the low-level functions group for deconvolution.
PVoigt(ppm, amp, x0, sigma, asym, eta)
PVoigt(ppm, amp, x0, sigma, asym, eta)
ppm |
a vector of ppm values |
amp |
amplitude of the lorentzian |
x0 |
central value of the lorentzian |
sigma |
half-width of both lorentzian and gaussian |
asym |
asymetric parameter |
eta |
mixing coefficient for the pseudo-voigt function (between 0 and 1) |
a vector of the lorentzian values (same size as ppm)
read_spectrum
belongs to the low-level functions group - it processes only one raw spectrum at time.
readSpectrum( ACQDIR, procParams, ppmnoise = c(10.2, 10.5), PHC = NULL, scaleIntensity = 1, verbose = 1 )
readSpectrum( ACQDIR, procParams, ppmnoise = c(10.2, 10.5), PHC = NULL, scaleIntensity = 1, verbose = 1 )
ACQDIR |
Full directory path of the raw spectrum, i.e the directory containing the FID |
procParams |
a Spec1rProcpar list |
ppmnoise |
the ppm range containing only noise signal in order to estimate the level of the noise (S/N ratio) |
PHC |
the phasing values applied to the spectrum in the frequency domain thus avoiding the automatic phasing step. Only useful if the input signal is an FID (procParams$INPUT_SIGNAL) |
scaleIntensity |
factor applied to the intensities to establish a change of scale. |
verbose |
level of debug information |
spec object
RWrapperCMD1D
belongs to the low-level functions group - it serves as a wrapper to
call internale functions for processing.
RWrapperCMD1D(cmdName, specMat, ...)
RWrapperCMD1D(cmdName, specMat, ...)
cmdName |
the name of internal function |
specMat |
a 'specMat' object |
... |
specific parameters of the requested function |
specMat
: a 'specMat' object
setLogFile
allows to redirect all log messages to a file
setLogFile(con = stdout())
setLogFile(con = stdout())
con |
a connection object which inherits from class "connection" |
# Redirect all log messages to a temporary file outtmp <- tempfile() con <- file(outtmp, "wt", encoding = "UTF-8") setLogFile(con) data_dir <- system.file("extra", package = "Rnmr1D") RAWDIR <- file.path(data_dir, "CD_BBI_16P02") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") SAMPLEFILE <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(RAWDIR, cmdfile=CMDFILE, samplefile=SAMPLEFILE, ncpu=2) close(con) readLines(outtmp)
# Redirect all log messages to a temporary file outtmp <- tempfile() con <- file(outtmp, "wt", encoding = "UTF-8") setLogFile(con) data_dir <- system.file("extra", package = "Rnmr1D") RAWDIR <- file.path(data_dir, "CD_BBI_16P02") CMDFILE <- file.path(data_dir, "NP_macro_cmd.txt") SAMPLEFILE <- file.path(data_dir, "Samples.txt") out <- Rnmr1D::doProcessing(RAWDIR, cmdfile=CMDFILE, samplefile=SAMPLEFILE, ncpu=2) close(con) readLines(outtmp)
Set the PPM bounds for proton (1H) and carbon (13C) to consider in the processing step and then to store in the specMat object
setPPMbounds(proton = c(-0.5, 11), carbon = c(0, 200))
setPPMbounds(proton = c(-0.5, 11), carbon = c(0, 200))
proton |
Minimal and Maximal ppm value for 1H NMR |
carbon |
Minimal and Maximal ppm value for 13C NMR |
Slice the spectrum in order to define ranges for Local Deconvolution (LSDeconv)
sliceSpectrum( spec, ppmrange = c(0.5, 9.5), flatwidth = 0.004, snrfactor = 4, maxrange = 0.3, excludezones = NULL )
sliceSpectrum( spec, ppmrange = c(0.5, 9.5), flatwidth = 0.004, snrfactor = 4, maxrange = 0.3, excludezones = NULL )
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
flatwidth |
specifies the minimum width of a zone in which the spectrum intensities are close to zero to consider this one as a cutting zone (default 0.003 ppm) |
snrfactor |
specifies factor applied on the Std Dev. of the noise used as threshold for first derivate intensities (default=4) |
maxrange |
specifies the maximum width of a cutting zone (default 0.3 ppm) |
excludezones |
specifies the exclusion zones as a matrix (Nx2), each row specifying a zone with 2 columns (ppm min and ppm max) (default NULL) |
a list of ppm range
Spec1rDoProc
belongs to the low-level functions group - it processes only one raw spectrum at time.
Spec1rDoProc(Input, param = Spec1rProcpar)
Spec1rDoProc(Input, param = Spec1rProcpar)
Input |
Full directory path of the raw spectrum |
param |
a Spec1rProcpar list |
spec object
Initialize Parameter Lists by the default ones
Spec1rProcpar
Spec1rProcpar
An object of class list
of length 41.
DEBUG
: Debug - defaut value = TRUE
LOGFILE
: Messages output file - default value = ""
VENDOR
: Instrumental origin of the raw data (bruker, varian, jeol, rs2d) - default value = 'bruker'
READ_RAW_ONLY
: Read raw file only; do not carry out processing; raw file is depending on INPUT_SIGNAL - default value = FALSE
INPUT_SIGNAL
: What type of input signal: 'fid' or '1r' - default value = 'fid'
PDATA_DIR
: subdirectory containing the 1r file (bruker's format only) - default value = 'pdata/1'
LB
: Exponantial Line Broadening parameter - default value = 0.3
GB
: Gaussian Line Broadening parameter - default value = 0
REMLFREQ
: Remove low frequencies by applying a polynomial subtraction method. - default order of the model = 0
REVPPM
: Reverse ppm scale - default value = FALSE
BLPHC
: Number of points for baseline smoothing during phasing - default value = 50
KSIG
: Number of times the noise signal to be considered during phasing - default value = 6
CPMG
: Indicate if CPMG sequence - default value = FALSE
ZEROFILLING
: Zero Filling - - default value = FALSE
ZFFAC
: Max factor for Zero Filling - default value = 4
LINEBROADENING
: Line Broading - default value = TRUE
TSP
: PPM referencing - default value = FALSE
RABOT
: Zeroing of Negative Values - default value = FALSE
OPTPHC0
: Zero order phase optimization - default value = TRUE
OPTPHC1
: First order phase optimization - default value = FALSE
OPTCRIT1
: Criterium for phasing optimization (1 for SSpos, 2 for SSneg, 3 for Entropy - default value = 2
JGD_INNER
: JEOL : internal (or external) estimation for Group Delay - default value = TRUE
specDeconv
= peakFinder
+ peakOptimize
. specDeconv
belongs to the low-level functions group for deconvolution.
specDeconv(spec, ppmrange, params = NULL, filter = "none", verbose = 1)
specDeconv(spec, ppmrange, params = NULL, filter = "none", verbose = 1)
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
params |
a list of specific parameters for deconvolution |
filter |
a filter type for filtering the noise and smoothing the signal |
verbose |
level of debug information |
a list
specModel
belongs to the low-level functions group for deconvolution.
specModel(spec, ppmrange, peaks)
specModel(spec, ppmrange, peaks)
spec |
a 'spec' object |
ppmrange |
a ppm range as a list in order to apply the deconvolution |
peaks |
a matrix defining peaks, one peak by row, with columns defined as : pos, ppm, amp, sigma, eta, integral |
a vector with the same size of the spectrum