Detecting the Chirality of a Protein by a Neural Network

Chirality in Protein Folding

In protein folding, every fold has a mirror-image counterpart. Deciding whether a protein was folded in the right direction is still an open problem.

In this article we describe a way to detect the chirality with a neural network built on TensorFlow.

Our Method

The entire process is described here in full so that anyone can reproduce and improve on it. It is a completely in-house development by Kroll-Software and is not based on any other ideas. Please cite us if you use this method.

The procedure predicts the chirality of a protein: whether a given fold is mirrored or not.

Motivation

When an algorithm folds a protein, the result is often folded in the wrong direction. It is therefore important to detect its chirality and to mirror the structure when needed.

Data

The training data come from the publicly accessible Protein Data Bank (PDB).
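The article does not say how mirrored training examples are obtained. Geometrically, reflecting a structure reverses the sign of every dihedral angle, so one possible way to derive a mirrored sample from a PDB structure is to negate its Phi and Psi arrays and label the result as mirrored. The following is a minimal sketch under that assumption; `MirrorAngles` is a hypothetical helper and not part of the original code.

```csharp
// Hypothetical helper (not from the original code): returns the dihedral
// angles of the mirror image by negating each angle. The sign flip is the
// same whether the angles are stored in radians or in degrees.
static double[] MirrorAngles(double[] angles)
{
    double[] mirrored = new double[angles.Length];
    for (int i = 0; i < angles.Length; i++)
        mirrored[i] = -angles[i];
    return mirrored;
}
```

Applied to both the Phi and the Psi array of a structure, this would give the input for a sample whose label is "mirrored".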

Implementation

The neural network uses Keras / TensorFlow (or another backend). We program in C# and use a modified version of Keras.NET. All code examples are therefore in C#, but they can easily be adapted to other languages.

Network Architecture / Topology

[Figure: chirality-model]

BaseModel BuildModel()
{
    bool useBias = false;
    string initializer = "random_uniform";

    Sequential model = new Sequential();

    // Input layer: 720 histogram bins (360 for Phi, 360 for Psi).
    model.Add(new Dense(units: 720, input_dim: 720));

    // Two tanh layers of the same width as the input.
    model.Add(new Dense(units: 720, activation: "tanh", use_bias: useBias, kernel_initializer: initializer));
    model.Add(new Dense(units: 720, activation: "tanh", use_bias: useBias, kernel_initializer: initializer));

    // Narrowing stack: each LeakyReLU acts on the output of the layer before it.
    model.Add(new LeakyReLU());
    model.Add(new Dense(units: 540, use_bias: useBias, kernel_initializer: initializer));

    model.Add(new LeakyReLU());
    model.Add(new Dense(units: 540, use_bias: useBias, kernel_initializer: initializer));

    model.Add(new LeakyReLU());
    model.Add(new Dense(units: 320, use_bias: useBias, kernel_initializer: initializer));

    // Output: two-class softmax (index 0 = not mirrored, index 1 = mirrored).
    model.Add(new Dense(units: 2, activation: "softmax"));

    // Adam with a very small learning rate (2.5e-6).
    var optimizer = new Adam(lr: 0.25e-5f);
    model.Compile(optimizer: optimizer,
        loss: "categorical_crossentropy",
        metrics: new string[] { "accuracy" });

    model.Summary();
    model.Plot("chirality-model.png");

    return model;
}
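To get a feeling for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the usual Keras behaviour that a Dense layer has a bias unless `use_bias` is switched off, and that LeakyReLU layers carry no parameters. `DenseParams` and `ChiralityModelParams` are illustrative names, not part of the original code.

```csharp
// Parameters of a Dense layer: one weight per input/unit pair,
// plus one bias per unit when the bias is enabled.
static long DenseParams(int inputs, int units, bool useBias)
{
    return (long)inputs * units + (useBias ? units : 0);
}

// Sums the Dense layers of BuildModel in the order they are added.
static long ChiralityModelParams()
{
    return DenseParams(720, 720, true)    // input layer, default bias
         + DenseParams(720, 720, false)   // tanh layer
         + DenseParams(720, 720, false)   // tanh layer
         + DenseParams(720, 540, false)
         + DenseParams(540, 540, false)
         + DenseParams(540, 320, false)
         + DenseParams(320, 2, true);     // softmax output, default bias
}
```

Under these assumptions the total is 2,409,762 parameters, which should match the figure printed by `model.Summary()`.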

Input Values

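The model receives the Phi and Psi angles of the protein's alpha chain as a vectorized array of 720 values. The array is a histogram with one bin per degree: 360 bins for Phi followed by 360 bins for Psi, normalized so that the largest bin is 1.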

Vectorization Function


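// Builds the 720-element input vector: a per-degree histogram of the Phi
// angles (bins 0..359) followed by one of the Psi angles (bins 360..719).
// Requires System.Linq for Max().
float[] Vectorize(double[] Phi, double[] Psi, bool randomize = false)
{
    float[] arr = new float[720];

    foreach (double d in Phi)
    {
        // Adding 180.5 before truncating rounds to the nearest degree
        // and shifts the range -180..180 to 0..360.
        double degree = d.ToDegree();
        int idx = (int)(degree + 180.5);

        // Wrap into the valid bin range 0..359.
        while (idx < 0)
            idx += 360;
        while (idx >= 360)
            idx -= 360;

        // RandomizeIndex returns the shifted index, so it must be assigned.
        if (randomize)
            idx = RandomizeIndex(idx);

        arr[idx] += 1;
    }

    foreach (double d in Psi)
    {
        double degree = d.ToDegree();
        int idx = (int)(degree + 180.5);

        while (idx < 0)
            idx += 360;
        while (idx >= 360)
            idx -= 360;

        if (randomize)
            idx = RandomizeIndex(idx);

        // Psi bins occupy the second half of the vector.
        arr[idx + 360] += 1;
    }

    if (randomize)
        AddNoise(arr, Phi.Length / 10);

    // Normalize so that the largest bin is 1 (skipped for an all-zero vector).
    float max = arr.Max();
    if (max > 0)
    {
        for (int j = 0; j < 720; j++)
            arr[j] /= max;
    }

    return arr;
}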
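The index arithmetic in Vectorize is easiest to follow on single values. The sketch below isolates it in a hypothetical helper, `BinIndex`, which takes the angle in degrees directly and so sidesteps the `ToDegree()` extension method that is not shown in this article.

```csharp
// Mirrors the index arithmetic in Vectorize for one angle given in degrees.
static int BinIndex(double degree)
{
    // Round to the nearest degree and shift -180..180 to 0..360.
    int idx = (int)(degree + 180.5);

    // Wrap into the valid bin range 0..359.
    while (idx < 0)
        idx += 360;
    while (idx >= 360)
        idx -= 360;

    return idx;
}
```

For example, -180 degrees lands in bin 0, -60 degrees in bin 120 and 0 degrees in bin 180, while 179.6 degrees rounds up to 360 and wraps back to bin 0. For a Psi angle, 360 is added to the result to reach the second half of the vector.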

Randomization

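We randomize the input vector for more robustness: each bin index is shifted by a random offset of fewer than 45 degrees in either direction, wrapping around at the ends of the range.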


int RandomizeIndex(int idx)
{
    const int range = 45;
    int r = ThreadSafeRandom.Next(range);
    if (ThreadSafeRandom.NextDouble() < 0.5)
        r = -r;

    int ret = idx + r;
    while (ret < 0)
        ret += 360;
    while (ret >= 360)
        ret -= 360;

    return ret;
}
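The `ThreadSafeRandom` helper used above is not shown in this article. A minimal sketch of such a helper, assuming it only needs the two calls that appear here and that they behave like `System.Random.Next(int)` and `System.Random.NextDouble()`, could keep one `Random` instance per thread:

```csharp
using System;
using System.Threading;

// Assumed stand-in for the helper used in this article, not the original code.
// System.Random is not thread-safe, so each thread gets its own instance.
public static class ThreadSafeRandom
{
    private static int seed = Environment.TickCount;

    // Each thread's Random is created lazily with a distinct seed.
    private static readonly ThreadLocal<Random> Local =
        new ThreadLocal<Random>(() => new Random(Interlocked.Increment(ref seed)));

    // Returns a non-negative integer less than maxValue.
    public static int Next(int maxValue) => Local.Value.Next(maxValue);

    // Returns a value in the range [0.0, 1.0).
    public static double NextDouble() => Local.Value.NextDouble();
}
```

With this version, `ThreadSafeRandom.Next(45)` yields values from 0 to 44, so the offset in RandomizeIndex lies between -44 and +44.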

Adding Noise

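We also add some noise: in each half of the vector, a number of randomly chosen bins is incremented by one. When called from Vectorize, that number is one tenth of the number of Phi angles.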


float[] AddNoise(float[] vec, int count)
{
    int half = vec.Length / 2;

    for (int i = 0; i < count; i++)
    {
        int idx = ThreadSafeRandom.Next(360);
        vec[idx] += 1;
    }

    for (int i = 0; i < count; i++)
    {
        int idx = ThreadSafeRandom.Next(360);
        vec[idx + half] += 1;
    }

    return vec;
}

Training

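This is our training function in Keras.NET / C#. It walks through the data in batches of 32 and returns the mean loss and the mean accuracy over all batches. With evaluate set, the whole data set is evaluated as a single batch instead of being trained on.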


public double[] BatchTrain(List<ChiralityData> data, bool evaluate = false, bool randomize = false)
{
    int batch_size = 32;
    int batchStart = 0;

    if (evaluate)
        batch_size = data.Count;

    List<double> retLoss = new List<double>();
    List<double> retErr = new List<double>();

    MRE.Reset();
    var ts = PythonEngine.BeginAllowThreads();

    try
    {
        while (batchStart < data.Count)
        {
            int count = Math.Min(batch_size, data.Count - batchStart);

            float[,] chirlist = new float[count, 720];
            float[,] yval = new float[count, 2];

            for (int j = 0; j < count; j++)
            {
                ChiralityData chir = data[j + batchStart];

                float[] vec = Vectorize(chir.Phi, chir.Psi, randomize);
                for (int k = 0; k < 720; k++)
                {
                    chirlist[j, k] = vec[k];
                }

                yval[j, 0] = chir.Mirrored ? 0 : 1;
                yval[j, 1] = chir.Mirrored ? 1 : 0;
            }

            using (Py.GIL())
            {
                var train = new NDarray(chirlist);
                var y = new NDarray(yval);

                double[] retVal = null;

                if (evaluate)
                    retVal = Model.Evaluate(train, y, verbose: 0);
                else
                    retVal = Model.TrainOnBatch(train, y);

                // retVal[0] is the loss, retVal[1] the accuracy metric.
                retLoss.Add(retVal[0]);
                retErr.Add(retVal[1]);
            }

            batchStart += batch_size;
        }
    }
    catch (Exception ex)
    {
        ex.LogError();
        HasErrors = true;
    }
    finally
    {
        PythonEngine.EndAllowThreads(ts);
        MRE.Set();
    }

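    // Average over all batches. Because the model is compiled with the
    // "accuracy" metric, errVal holds the mean accuracy, not an error rate.
    // Both values are NaN if no batch completed.
    double lossVal = retLoss.Sum() / retLoss.Count;
    double errVal = retErr.Sum() / retErr.Count;

    return new double[] { lossVal, errVal };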
}
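Once the model is trained, its two softmax outputs have to be turned back into a decision. The label layout in BatchTrain puts "not mirrored" at index 0 and "mirrored" at index 1, so a prediction can be decoded as in the sketch below. `IsMirrored` is a hypothetical helper; how the two probabilities are read back from the model depends on the Keras.NET version in use and is left out here.

```csharp
// Decodes a two-element softmax output using the label layout from BatchTrain:
// index 0 = not mirrored, index 1 = mirrored.
static bool IsMirrored(float[] probabilities)
{
    if (probabilities == null || probabilities.Length != 2)
        throw new System.ArgumentException(
            "Expected exactly two class probabilities.", nameof(probabilities));

    // The class with the larger probability wins; a tie counts as not mirrored.
    return probabilities[1] > probabilities[0];
}
```

A structure for which this returns true would then be mirrored before further use, as described under Motivation.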

Credits

Citation needed.

Support this Research

If you enjoy this research, we look forward to your support. Please contact us.