How Can I Implement This Model?
Solution 1:
The problem is with how Keras calculates the accuracy. For example, in the code below:
import numpy as np
import tensorflow as tf
y_true = np.array([[1,0,0,0,1,0,0,0,1]])
y_pred = np.array([[.8,.1,.1,1,10,2,2,3,5.5]])
metric = tf.keras.metrics.Accuracy()
metric.update_state(y_true,y_pred)
metric.result().numpy()
The calculated accuracy is zero. However, comparing [.8,.1,.1] with [1,0,0], [1,10,2] with [0,1,0], and [2,3,5.5] with [0,0,1] shows that y_pred is actually very accurate: the largest value in each group of three sits at the position of the 1 in y_true. The metric only counts exact element-wise matches, and this might be the reason why your model just does not seem to work. To handle this problem under the current model, applying a sigmoid activation in the output layer might help. You can check this by running the following code:
import numpy as np
import tensorflow as tf
import keras
from sklearn.preprocessing import MinMaxScaler

def dataset_gen(num_samples):
    # each data row consists of six floats, which is the feature vector of a
    # 5-character string pattern comprising 3 classes (e.g. AABBC, etc.)
    # in order to represent this 5-character string, a sequentially ordered
    # one-hot encoding vector is used
    np.random.seed(0)
    output_classes = np.random.randint(0,3,size=(num_samples,5))
    transform_mat = np.arange(-15,15).reshape(5,6) + .1*np.random.rand(5,6)
    print(transform_mat)
    feature_vec = output_classes @ transform_mat
    output_classes += np.array([0,3,6,9,12])
    # convert output_classes to one-hot encoding
    output_vec = np.zeros((num_samples,15))
    for ind,item in enumerate(output_classes):
        output_vec[ind][item] = 1.
    return feature_vec,output_vec

def create_model():
    # a simple sequential model
    n_hidden,num_features,num_outputs = 16,6,15
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(num_features,)))
    model.add(tf.keras.layers.Dense(n_hidden,activation="relu"))
    model.add(tf.keras.layers.Dense(num_outputs,activation="sigmoid"))
    return model

def loss(y_true, y_pred):
    # one softmax cross-entropy term per group of 3 outputs
    l1 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, :3], y_pred[:, :3])
    l2 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 3:6], y_pred[:, 3:6])
    l3 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 6:9], y_pred[:, 6:9])
    l4 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 9:12], y_pred[:, 9:12])
    l5 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 12:], y_pred[:, 12:])
    return l1 + l2 + l3 + l4 + l5

# create Stochastic Gradient Descent optimizer for the NN model
# opt_function = keras.optimizers.Adam(learning_rate=.1)
# generate the dataset (this call is not shown in the original snippet;
# the sample count of 1000 is an assumed value)
test_x,test_z = dataset_gen(1000)
# create a sequential NN model
model = create_model()
model.compile(optimizer='adam', loss=loss, metrics=['accuracy'])
es = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy',mode='max',verbose=1,patience=100)
history = model.fit(test_x,test_z,epochs=2000,batch_size=8,
                    callbacks=[es],validation_split=0.2,
                    verbose=0)
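
To make the accuracy mismatch described at the start of this answer concrete, here is a minimal NumPy sketch that scores each group of three outputs by its argmax instead of by exact equality. The helper name `group_accuracy` and its `group_size` parameter are illustrative assumptions, not part of Keras.

```python
import numpy as np

def group_accuracy(y_true, y_pred, group_size=3):
    """Fraction of groups whose argmax in y_pred matches y_true (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # split the flat output into (samples, groups, group_size)
    true_cls = y_true.reshape(len(y_true), -1, group_size).argmax(axis=-1)
    pred_cls = y_pred.reshape(len(y_pred), -1, group_size).argmax(axis=-1)
    return float((true_cls == pred_cls).mean())

y_true = np.array([[1, 0, 0, 0, 1, 0, 0, 0, 1]])
y_pred = np.array([[.8, .1, .1, 1, 10, 2, 2, 3, 5.5]])

# exact element-wise equality, which is what tf.keras.metrics.Accuracy measures
exact_match = float((y_true == y_pred).mean())
# group-wise comparison of the predicted class with the true class
grouped = group_accuracy(y_true, y_pred)
print(exact_match, grouped)
```

The same comparison could be wrapped as a custom Keras metric, but that wiring is left out here to keep the sketch independent of any particular TensorFlow version.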
Solution 2:
Sequential is used when you have a single network input and output. In the current setup you have multiple output layers because consecutive groups of 3 output values are linked. With a single output layer, this link can instead be enforced through the loss function.
import numpy as np
import tensorflow as tf
# random input data with 6 features
inp = tf.random.uniform(shape=(1000, 6))
# output data taking into consideration that 3 consecutive bits are one class.
out1 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out2 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out3 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out4 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out5 = tf.one_hot(tf.random.uniform(shape=(1000,), dtype=tf.int32, maxval=3), depth=3)
out = tf.concat([out1, out2, out3, out4, out5], axis=1)
# a simple sequential model
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(6,)))
model.add(tf.keras.layers.Dense(20, activation="relu"))
model.add(tf.keras.layers.Dense(20, activation="relu"))
model.add(tf.keras.layers.Dense(15))
# custom loss to take into account the dependency between the 3 bits
def loss(y_true, y_pred):
    l1 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, :3], y_pred[:, :3])
    l2 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 3:6], y_pred[:, 3:6])
    l3 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 6:9], y_pred[:, 6:9])
    l4 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 9:12], y_pred[:, 9:12])
    l5 = tf.nn.softmax_cross_entropy_with_logits(y_true[:, 12:], y_pred[:, 12:])
    return l1 + l2 + l3 + l4 + l5
opt_function = tf.keras.optimizers.SGD()
model.compile(optimizer=opt_function, loss=loss)
model.fit(inp, out, batch_size=10)
The same idea needs to be used when evaluating the network as well. You need to take argmax over 3 bits separately (5 times) so that you get a sequence of 5 classes as output.
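
As a rough sketch of that evaluation step, the 15 raw outputs can be reshaped into 5 groups of 3 and reduced with one argmax per group. The helper name `decode_groups`, the sample logits, and the mapping of class indices to the letters A, B, C are assumptions made for illustration; in practice `logits` would come from something like `model.predict(inp)`.

```python
import numpy as np

def decode_groups(logits, num_groups=5, group_size=3):
    """Return one class index per group for each sample (hypothetical helper)."""
    logits = np.asarray(logits)
    # (samples, 15) -> (samples, 5, 3), then argmax inside each group of 3
    return logits.reshape(-1, num_groups, group_size).argmax(axis=-1)

# made-up logits for a single sample, written as 5 groups of 3
logits = np.array([[2.0, 0.1, -1.0,
                    0.0, 0.3, 0.2,
                    -3.0, -2.0, -1.0,
                    5.0, 4.0, 4.5,
                    0.0, 1.0, 0.5]])

classes = decode_groups(logits)
# assumed mapping of class indices 0, 1, 2 to the characters A, B, C
labels = "".join("ABC"[c] for c in classes[0])
print(classes, labels)
```

Because argmax is unaffected by softmax, there is no need to convert the logits to probabilities before decoding.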
Solution 3:
I think this is where the problem arises.
model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
...
loss=['categorical_crossentropy'] * 5
>>>Shapes (10, 3) and (10, 15) are incompatible
You don't really want to mess with your loss function like that. Try to fix your output instead. Models created with the Sequential API are the simpler ones that have a single input and a single output. If you want to turn a Functional API model into this simpler layout, you should merge the inputs/outputs into a single input/output, which means that you should also merge the labels after one-hot encoding.
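
A minimal NumPy sketch of merging the labels after one-hot encoding might look like the following. The integer label array is made-up example data, and the layout (5 positions with 3 classes each, giving 15 columns) is assumed from the shapes in the error message above.

```python
import numpy as np

num_classes = 3

# hypothetical integer labels: 4 samples, 5 positions, classes 0..2
labels = np.array([[0, 0, 1, 1, 2],
                   [2, 1, 0, 0, 1],
                   [1, 1, 1, 2, 2],
                   [0, 2, 0, 2, 0]])

# one-hot encode every position: shape (4, 5, 3)
one_hot = np.eye(num_classes)[labels]
# merge the 5 one-hot vectors of each sample into a single row: shape (4, 15)
merged = one_hot.reshape(len(labels), -1)

print(merged.shape)
print(merged[0])
```

Each merged row then contains exactly five ones, one per group of three columns, which matches a single 15-unit output layer.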
WARNING:tensorflow:AutoGraph could not transform <function loss at 0x000001F571B4F820> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
This warning won't stop your model from training, so you can ignore it. If the model still doesn't train, then you should probably start tweaking hyperparameters!
Solution 4:
Before I mention my solution, I will warn you that it's not correct, as the methodology is wrong, but it might work if you have a very large dataset. What you want to do is treat each set of 3 values as a multi-class problem and the characters as a multi-label problem, which is not possible. You can't divide your problem like this for sequential models. But if you have a large dataset, you can treat it as a multi-label problem as a whole, in which case there will be cases when you get 2 active labels within any of the sets of 3, and you have to apply post-processing in some manner. Say, within each set, set active only the label which has the highest sigmoid value.
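
One possible form of that post-processing, sketched in NumPy under the assumption that the model emits 15 sigmoid values laid out as 5 consecutive groups of 3: keep only the highest value in each group and zero the rest. The helper name `keep_top_per_group` and the sample probabilities are hypothetical.

```python
import numpy as np

def keep_top_per_group(probs, group_size=3):
    """Turn sigmoid outputs into exactly one active label per group (hypothetical helper)."""
    probs = np.asarray(probs, dtype=float)
    # (samples, 15) -> (samples, 5, 3)
    grouped = probs.reshape(len(probs), -1, group_size)
    # index of the highest sigmoid value inside each group
    winners = grouped.argmax(axis=-1)
    # rebuild a one-hot vector per group and flatten back to (samples, 15)
    cleaned = np.eye(group_size)[winners]
    return cleaned.reshape(len(probs), -1)

# made-up sigmoid outputs: the first group has two values above 0.5,
# the third group has none
probs = np.array([[0.9, 0.7, 0.1,
                   0.2, 0.6, 0.1,
                   0.4, 0.3, 0.2,
                   0.1, 0.8, 0.95,
                   0.5, 0.1, 0.1]])

cleaned = keep_top_per_group(probs)
print(cleaned)
```

Note that this rule also forces a label in groups where no value crosses the usual 0.5 threshold, which may or may not be what you want.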