
Ad Hoc Examples

Before we look at a real-world example, let's look at the structure of two programs, written in pseudocode, that implement ad-hoc (application-level) C/R. Both examples use the functions save and load to save and load a single value, respectively (the value can be of any type, such as a matrix of decimal numbers). The first example performs a couple of long operations, where each operation depends on the result of the previous one. As a tradeoff between implementation complexity and the granularity of save points, only the final result of each long operation is saved. Each variable is assumed to be saved to a file of the same name with the ".save" extension.

if (not exists intermediateResult) {
    if exists File("intermediateResult.save") {
        intermediateResult = load("intermediateResult.save")
    } else {
        intermediateResult = longComputation(inputData)
        save(intermediateResult)
    }
}

finalResult = computeFinalResult(intermediateResult)
save(finalResult)
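
The first pseudocode listing could be sketched in MATLAB roughly as follows. This is only an illustration under stated assumptions: the function name twoStagePipeline, the saveFile argument (assumed to end in ".mat"), and the two stand-in computations are hypothetical and are not part of the original listing.

```matlab
function finalResult = twoStagePipeline(inputData, saveFile)
% Hypothetical sketch of stage-level checkpointing (names are assumptions).
% saveFile is assumed to be a path ending in '.mat'.
if exist(saveFile, 'file') == 2
    % A checkpoint exists: restore the intermediate result instead of recomputing.
    s = load(saveFile, 'intermediateResult');
    intermediateResult = s.intermediateResult;
else
    intermediateResult = longComputation(inputData);
    save(saveFile, 'intermediateResult');   % checkpoint after the long stage
end
finalResult = computeFinalResult(intermediateResult);
end

function r = longComputation(x)
% Stand-in for the expensive first stage.
r = cumsum(x);
end

function r = computeFinalResult(x)
% Stand-in for the second stage, which depends on the first.
r = sum(x);
end
```

Note that a stale checkpoint file is reused even if inputData has changed, so the checkpoint should be deleted (or named after the input) when starting a new problem.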

In the next example, data is saved incrementally during a loop. One apparent problem is that the save function, as called, will likely write the entire outputData vector on every iteration. Take care when using or implementing a save procedure to avoid this kind of extreme and unnecessary overhead.

startIndex = 0

if (not exists outputData) {
    if exists File("outputData.save") {
        outputData = load("outputData.save")
    }
}

if (exists outputData) {
    startIndex = first ii such that outputData(ii) is uninitialized
}

for (ii=startIndex; ii < 10000; ii++) {
    outputData(ii) = runComputation(inputData(ii))
    save(outputData)
}
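
One way to reduce the overhead noted above is to checkpoint only every few iterations and to store the restart index explicitly rather than searching for the first uninitialized element. The MATLAB sketch below assumes this approach; the function name runWithCheckpoints, the checkpointEvery parameter, the saveFile argument (assumed to end in ".mat"), and the stand-in runComputation are all hypothetical.

```matlab
function outputData = runWithCheckpoints(inputData, saveFile, checkpointEvery)
% Hypothetical sketch: resume a loop from a periodically written checkpoint.
% saveFile is assumed to be a path ending in '.mat'.
n = numel(inputData);
outputData = nan(n, 1);   % NaN marks entries that have not been computed yet
nextIndex = 1;

if exist(saveFile, 'file') == 2
    % Restore both the partial results and the position to resume from.
    s = load(saveFile, 'outputData', 'nextIndex');
    outputData = s.outputData;
    nextIndex = s.nextIndex;
end

for ii = nextIndex:n
    outputData(ii) = runComputation(inputData(ii));
    if mod(ii, checkpointEvery) == 0 || ii == n
        nextIndex = ii + 1;
        % Write to a temporary file first, then rename it, so that an
        % interruption during save does not corrupt the last good checkpoint.
        tmpFile = [saveFile '.tmp.mat'];
        save(tmpFile, 'outputData', 'nextIndex');
        movefile(tmpFile, saveFile, 'f');
    end
end
end

function y = runComputation(x)
% Stand-in for the expensive per-element computation.
y = x.^2;
end
```

A larger checkpointEvery means less time spent writing files but more recomputation after a failure; the right value depends on how long one iteration takes relative to one save.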

The following MATLAB example shows excerpts of code from a research project that used ad-hoc C/R. It is not necessary to read the code in detail; it merely shows what an undesirable example may look like. Of the two pseudocode listings above, it more closely resembles the first. The relevant parts are the save procedures, which use MATLAB's save function. The first two calls to save are meant to store incrementally generated data, whereas the last two calls merely store the final results of the computation.

function epiData = variedUptakeEpiSim(                         ...
    model, method, savename, rxnid, A, fredux, WTFlux, grData  ...
)

%%% initialization code removed from snippet %%%

% optional "checkpoint" arg WTFlux was not provided, so recompute
if nargin < 7
  WTFlux = zeros(lA,nrxn);
  parfor i=1:lA
    % ...
    disp(strcat('Finished geometric WT sim ',num2str(i)));
  end
  potentialFail = abs(WTFlux(:,[536,1577])) < 1e-7;
  pFsum = sum(potentialFail(:,1) & potentialFail(:,2));
  save(strcat(savename,'_Flux.mat'),'WTFlux');
  if pFsum > 0
    error(strcat(num2str(pFsum),' unlikely solutions encountered from geoFBA'))
  end
end

% optional "checkpoint" arg grData was not provided, so recompute
if nargin < 8
  grData = zeros(lA,ngen,ngen);
  disp(lA)
  for i=1:lA
    mtmp = model;
    % ...
    dtmp = doubleGeneMutationIsoAvg(mtmp, method, [fredux], [fredux], ...
                                    squeeze(WTFlux(i,:)), dlvl);
    grData(i,:,:) = dtmp;
    disp(i);
  end
  save(strcat(savename,'_gr.mat'),'grData');
end

disp('Starting Epistasis  Loop');
for i=1:lA
  % ...
  epiData(i,:,:) = squeeze(eEffect(:,:,cellij,cellij));
  dlmwrite(strcat(savename,num2str(A(i))), squeeze(grRateKOTens(:,:,cellij,cellij)));
  dlmwrite(strcat('epi',savename,num2str(A(i))), squeeze(epiData(i,:,:)));
end

save(strcat(savename,'_epi.mat'),'epiData');
uptakeFlux=A;
save(strcat(savename,'_rxn',num2str(rxnid),'.mat'),'uptakeFlux');

Note that something as simple as a typo in one of the first two save statements would not only cause a run-time error; it would also prevent the save itself from completing. The results of a long computation could then be lost, which is precisely what checkpointing is meant to avoid. To guard against this, test your code at a small scale before running it at a large scale.
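
In addition to small-scale testing, it can help to route every checkpoint write through one small helper that is easy to test on its own and that does not discard data when the write fails. The following is a minimal sketch of that idea, not something taken from the project above; the helper name safeSave and its fallback to a temporary file are assumptions made for illustration.

```matlab
function savedTo = safeSave(saveFile, varName, value)
% Hypothetical helper: save one variable, falling back to a temporary file
% if the requested location cannot be written.
s = struct();
s.(varName) = value;        % the field name becomes the variable name in the file
savedTo = saveFile;
try
    save(saveFile, '-struct', 's');
catch err
    % Keep the data rather than losing it along with the failed save.
    savedTo = [tempname '.mat'];
    save(savedTo, '-struct', 's');
    warning('safeSave:fallback', ...
        'Could not write %s (%s). Data written to %s instead.', ...
        saveFile, err.message, savedTo);
end
end
```

For example, safeSave(strcat(savename,'_Flux.mat'), 'WTFlux', WTFlux) would take the place of a direct call to save. This does not replace testing: a typo in the call to the helper itself would still raise an error.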

Where is the restore procedure? In this case there isn't one: the data variables (WTFlux and grData) are optional function arguments, and the caller is expected to load them separately as needed. However, adding optional function arguments isn't enough: the code must still check for them, here with nargin conditionals, to avoid regenerating data unnecessarily. Because nargin checks depend on the order of the function arguments, they lead to inflexible and error-prone code. Instead, consider using varargin or the inputParser class as better alternatives to nargin, or similar approaches in languages other than MATLAB.
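
As a rough illustration of the inputParser alternative, the sketch below accepts the checkpoint data as name-value pairs, so either one can be supplied without the other and argument order no longer matters. The function name simSketch and the stand-in helpers computeFlux and computeGr are hypothetical; only the parameter names WTFlux and grData come from the excerpt above.

```matlab
function out = simSketch(model, varargin)
% Hypothetical sketch: optional "checkpoint" arguments as name-value pairs.
p = inputParser;
addRequired(p, 'model');
addParameter(p, 'WTFlux', []);   % optional checkpoint data
addParameter(p, 'grData', []);   % optional checkpoint data
parse(p, model, varargin{:});

% UsingDefaults lists the parameters that the caller did not supply.
recomputeFlux = ismember('WTFlux', p.UsingDefaults);
recomputeGr   = ismember('grData', p.UsingDefaults);

if recomputeFlux
    WTFlux = computeFlux(model);
else
    WTFlux = p.Results.WTFlux;
end

if recomputeGr
    grData = computeGr(model, WTFlux);
else
    grData = p.Results.grData;
end

out = struct('WTFlux', WTFlux, 'grData', grData, ...
             'recomputed', [recomputeFlux, recomputeGr]);
end

function f = computeFlux(model)
% Stand-in for the expensive flux computation.
f = model * 10;
end

function g = computeGr(~, WTFlux)
% Stand-in for the expensive growth-rate computation.
g = WTFlux + 1;
end
```

A caller that already has grData on disk could then write s = load('run_gr.mat'); out = simSketch(model, 'grData', s.grData); where the file name is only an example. With nargin checks, supplying grData without also supplying WTFlux would not be possible.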
