learning_rate 0.583
activation relu
activation2 selu
activation3 LeakyRelu
batch_size 40
embed_dim 28
epochs 20
num_heads 14
ff_dim 14
neurons 38
neurons2 12
dropout_rate 0.16
dropout_rate2 0.17
Optimizer Adadelta
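
The settings listed above can be loaded into a dictionary for use in a training script. What follows is a minimal Python sketch, assuming the values arrive as whitespace-separated name/value tokens. The names `RAW`, `coerce`, and `parse_hparams` are hypothetical helpers introduced for illustration and are not part of the original.

```python
# Assumption: the hyperparameters are stored as alternating name/value tokens.
RAW = (
    "learning_rate 0.583 activation relu activation2 selu activation3 LeakyRelu "
    "batch_size 40 embed_dim 28 epochs 20 num_heads 14 ff_dim 14 neurons 38 "
    "neurons2 12 dropout_rate 0.16 dropout_rate2 0.17 Optimizer Adadelta"
)


def coerce(value):
    """Return the token as int or float when it parses as one, else as a string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def parse_hparams(text):
    """Pair up alternating name/value tokens into a dict (hypothetical helper)."""
    tokens = text.split()  # splits on spaces and newlines alike
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of name/value tokens")
    return {name: coerce(value) for name, value in zip(tokens[0::2], tokens[1::2])}


hparams = parse_hparams(RAW)
```

Because `str.split()` treats any whitespace as a separator, the same helper works whether the pairs sit on one line or one pair per line. Names are kept exactly as written, so the optimizer is looked up as `hparams["Optimizer"]` with a capital O.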