
How Can I Implement This Model?

Problem statement I have 3 classes (A, B, and C). I have 6 features: train_x = [[ 6.442 6.338 7.027 8.789 10.009 12.566] [ 6.338 7.027 5.338 10.009 8.122 11.217]

Solution 1:

The problem is with how Keras calculates the accuracy. For example, in the code below:

y_true = np.array([[1,0,0,0,1,0,0,0,1]]) 
y_pred = np.array([[.8,.1,.1,1,10,2,2,3,5.5]]) 

metric = tf.keras.metrics.Accuracy()
metric.update_state(y_true,y_pred)
metric.result().numpy()

The calculated accuracy is zero, because tf.keras.metrics.Accuracy counts a prediction as correct only when it exactly equals the label, element by element. However, by comparing

  1. [.8,.1,.1] with [1,0,0]
  2. [1,10,2] with [0,1,0]
  3. [2,3,5.5] with [0,0,1]

we can see that y_pred is actually very accurate: in each group, the largest value sits at the position of the 1 in y_true. This mismatch might be the reason why your model appears not to work. To handle this problem under the current model, applying a sigmoid activation in the output layer might help; you can check this by running the following code:

import numpy as np
import tensorflow as tf 
import keras
from sklearn.preprocessing import MinMaxScaler


def dataset_gen(num_samples):
    # each data row consists of six floats, which is the feature vector of a 5-character
    # string pattern comprising 3 classes (e.g. AABBC, etc.)
    # in order to represent this 5-character string, a sequentially ordered one-hot encoding vector is used
    np.random.seed(0)
    output_classes = np.random.randint(0,3,size=(num_samples,5))
    transform_mat = np.arange(-15,15).reshape(5,6) + .1*np.random.rand(5,6)
    print(transform_mat)
    feature_vec = output_classes @ transform_mat
    output_classes += np.array([0,3,6,9,12])
    # convert output_classes to one-hot encoding
    output_vec = np.zeros((num_samples,15))
    for ind,item in enumerate(output_classes):
        output_vec[ind][item] = 1.
    return feature_vec,output_vec


def create_model():
    # a simple sequential model
    n_hidden,num_features,num_outputs = 16,6,15
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(num_features,)))
    model.add(tf.keras.layers.Dense(n_hidden,activation="relu"))
    model.add(tf.keras.layers.Dense(num_outputs,activation="sigmoid"))
    return model

def loss(y_true, y_pred):
    l1 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, :3], y_pred[:, :3])
    l2 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 3:6], y_pred[:, 3:6])
    l3 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 6:9], y_pred[:, 6:9])
    l4 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 9:12], y_pred[:, 9:12])
    l5 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 12:], y_pred[:, 12:])
    
    return l1 + l2 + l3 + l4 + l5

# create Stochastic Gradient Descent optimizer for the NN model
# opt_function = keras.optimizers.Adam(learning_rate=.1)
# create a sequential NN model
model = create_model()
model.compile(optimizer='adam', loss=loss, metrics=['accuracy'])

es = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy',mode='max',verbose=1,patience=100)
history = model.fit(test_x,test_z,epochs=2000,batch_size=8,
                    callbacks=es,validation_split=0.2,
                    verbose=0)
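
The group-by-group comparison described above can also be computed directly. The following is a minimal NumPy sketch, not part of the original answer: the helper name grouped_accuracy and its group_size parameter are hypothetical, and it assumes the outputs are laid out as consecutive groups of 3 values.

```python
import numpy as np

def grouped_accuracy(y_true, y_pred, group_size=3):
    """Fraction of groups whose argmax in y_pred matches the argmax in y_true.

    Hypothetical helper: assumes each row is a concatenation of
    one-hot groups of length `group_size`.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # (n_samples, n_groups * group_size) -> (n_samples, n_groups, group_size)
    true_cls = y_true.reshape(len(y_true), -1, group_size).argmax(axis=-1)
    pred_cls = y_pred.reshape(len(y_pred), -1, group_size).argmax(axis=-1)
    return float((true_cls == pred_cls).mean())

y_true = np.array([[1, 0, 0, 0, 1, 0, 0, 0, 1]])
y_pred = np.array([[.8, .1, .1, 1, 10, 2, 2, 3, 5.5]])

# Exact element-wise equality: no element of y_pred equals its label.
elementwise = float((y_true == y_pred).mean())
print(elementwise)                       # 0.0
print(grouped_accuracy(y_true, y_pred))  # 1.0
```

On the example from this answer, exact matching scores 0.0 while the per-group argmax comparison scores 1.0, which is the gap the answer points out.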

Solution 2:

Sequential is meant for models with a single network input and a single output. In the current setup you have multiple output layers, to take into consideration that consecutive groups of 3 output values are linked. This link can be enforced through the loss function as well.

import numpy as np
import tensorflow as tf

# random input data with 6 features
inp = tf.random.uniform(shape=(1000, 6))

# output data taking into consideration that 3 consecutive bits are one class.
out1 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out2 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out3 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out4 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out5 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)

out = tf.concat([out1, out2, out3, out4, out5], axis=1)

# a simple sequential model 
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(6,)))
model.add(tf.keras.layers.Dense(20, activation="relu"))
model.add(tf.keras.layers.Dense(20, activation="relu"))
model.add(tf.keras.layers.Dense(15))


# custom loss to take into account the dependency between the 3 bits

def loss(y_true, y_pred):
    l1 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, :3], y_pred[:, :3])
    l2 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 3:6], y_pred[:, 3:6])
    l3 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 6:9], y_pred[:, 6:9])
    l4 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 9:12], y_pred[:, 9:12])
    l5 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 12:], y_pred[:, 12:])
    
    return l1 + l2 + l3 + l4 + l5

opt_function = tf.keras.optimizers.SGD()

model.compile(optimizer=opt_function, loss=loss)
model.fit(inp, out, batch_size=10)

The same idea needs to be used when evaluating the network as well. You need to take argmax over 3 bits separately (5 times) so that you get a sequence of 5 classes as output.
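
The evaluation step described above might look like the following sketch. It is an illustration only: the helper decode_sequences is hypothetical, the mapping of indices 0, 1, 2 to the letters A, B, C is an assumption, and the logits are made-up values rather than output of the model above.

```python
import numpy as np

# Assumed class order; the original question does not state the index mapping.
CLASS_NAMES = np.array(["A", "B", "C"])

def decode_sequences(y_pred, group_size=3):
    """Take argmax over each consecutive group of 3 outputs (5 groups per row)."""
    y_pred = np.asarray(y_pred, dtype=float)
    # (n_samples, 15) -> (n_samples, 5, 3), then argmax inside each group
    class_idx = y_pred.reshape(len(y_pred), -1, group_size).argmax(axis=-1)
    return ["".join(CLASS_NAMES[row]) for row in class_idx]

# Made-up raw outputs for one sample (15 values = 5 groups of 3).
logits = np.array([[2.0, 0.1, -1.0,
                    0.3, 0.2, 1.5,
                    -0.5, 0.9, 0.1,
                    1.1, 1.0, 0.2,
                    0.0, 0.4, 0.3]])
print(decode_sequences(logits))  # ['ACBAB']
```

Because softmax is monotonic, taking argmax on the raw logits gives the same classes as taking it on the softmax probabilities.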

Solution 3:

I think this is where the problem arises.

 model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
...
loss=['categorical_crossentropy'] * 5

>>>Shapes (10, 3) and (10, 15) are incompatible

You don't really want to mess with your loss function like that. Try to fix your output instead. Models created with the Sequential API are the simpler ones that have a single input and a single output. If you want to convert a Functional API model to this simpler layout, you should merge the inputs/outputs into a single input/output, which means that you should also merge the labels after one-hot encoding.
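
As a rough illustration of merging the labels after one-hot encoding, here is a small NumPy sketch. The helper merge_one_hot is hypothetical, and encoding A, B, C as 0, 1, 2 is an assumption; the point is only that five 3-way one-hot vectors become one 15-value target per sample, matching a single Dense(15) output.

```python
import numpy as np

def merge_one_hot(labels, num_classes=3):
    """One-hot encode each of the 5 positions, then flatten into one vector per sample."""
    labels = np.asarray(labels)
    one_hot = np.eye(num_classes)[labels]    # shape (n_samples, 5, 3)
    return one_hot.reshape(len(labels), -1)  # shape (n_samples, 15)

# One sample, "AABBC", under the assumed encoding A=0, B=1, C=2.
labels = np.array([[0, 0, 1, 1, 2]])
merged = merge_one_hot(labels)
print(merged.shape)  # (1, 15)
print(merged[0])
```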

WARNING:tensorflow:AutoGraph could not transform <function loss at 0x000001F571B4F820> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert

This warning will not prevent your model from training, so you can ignore it. If it doesn't train, then you should probably start tweaking hyperparameters!

Solution 4:

Before I describe my solution, I will warn you that it is not correct, as the methodology is wrong, but it might work if you have a very large dataset. What you want to do is treat each set of 3 values as a multi-class problem and the characters as a multi-label problem, which is not possible. You can't divide your problem like this for sequential models. But if you have a large dataset, you can treat it as a multi-label problem as a whole, in which case there will be cases where 2 labels are active in any of the sets of 3, and you have to apply post-processing in some manner. Say, set as active only the label that has the highest sigmoid value within its set.
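
The post-processing idea could be sketched as follows. This is only one possible reading of it: the helper resolve_groups, the 0.5 threshold, and the example probabilities are all assumptions, and the example uses two groups instead of five to stay short.

```python
import numpy as np

def resolve_groups(probs, group_size=3, threshold=0.5):
    """Keep exactly one active label per group: the one with the highest sigmoid value.

    Also returns how many labels per group exceeded `threshold`,
    so conflicting (2+) or empty (0) groups can be inspected.
    """
    probs = np.asarray(probs, dtype=float)
    grouped = probs.reshape(len(probs), -1, group_size)
    active_counts = (grouped > threshold).sum(axis=-1)
    winners = grouped.argmax(axis=-1)
    cleaned = np.zeros_like(grouped)
    # Write a 1 at the winning position of every group.
    np.put_along_axis(cleaned, winners[..., None], 1.0, axis=-1)
    return cleaned.reshape(probs.shape), active_counts

# Made-up sigmoid outputs: first group has 2 active labels, second has none.
probs = np.array([[0.9, 0.7, 0.1, 0.2, 0.3, 0.1]])
cleaned, counts = resolve_groups(probs)
print(cleaned)  # [[1. 0. 0. 0. 1. 0.]]
print(counts)   # [[2 0]]
```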
