The goal of this program is to predict a game outcome given a game-reference-id (the gameReferenceId column), which is a serial number. Here is a sample of the data:

    id,totalGreen,totalBlue,totalRed,totalYellow,sumNumberOnGreen,sumNumberOnBlue,sumNumberOnRed,sumNumberOnYellow,gameReferenceId,createdAt,updatedAt
    1,1,3,2,0,33,27,41,0,1963886,2020-08-07 20:27:49,2020-08-07 20:27:49
    2,1,4,1,0,36,110,31,0,1963887,2020-08-07 20:28:37,2020-08-07 20:28:37
    3,1,3,2,0,6,33,83,0,1963888,2020-08-07 20:29:27,2020-08-07 20:29:27
    4,2,2,2,0,45,58,44,0,1963889,2020-08-07 20:30:17,2020-08-07 20:30:17
    5,0,2,4,0,0,55,82,0,1963890,2020-08-07 20:31:07,2020-08-07 20:31:07
    6,2,4,0,0,36,116,0,0,1963891,2020-08-07 20:31:57,2020-08-07 20:31:57
    7,3,2,1,0,93,16,40,0,1963892,2020-08-07 20:32:47,2020-08-07 20:32:47

Here's the link for a full training dataset.

After training the model, I find it difficult to use it to predict the game outcome, because the game-reference-id is the only independent column, while the other columns are random.

Is there a way to flip the features with the labels during prediction?

nbro
Tetranyble
  • Your question was indeed a little bit unclear. It seems that you wanted to predict the independent features rather than the dependent ones, but it's not fully clear why you want to do that, if your initial goal is to predict the game output (dependent feature). I tried to clarify your post, but I am not sure if it changed the meaning or not. Please, read your post again and try to clarify what you're really asking, after my feedback. – nbro Dec 20 '20 at 12:49

1 Answer

The body of your post seems to ask a completely different question from the title, so I will answer both:

"Body: How do I complete the goal of this program?"

Your dataset does not contain the dependent variable, which is the outcome of the game (win/loss/draw). I am assuming that you have a way of looking up the outcome of a game from either the "id" field or the "gameReferenceId" field.

So you would have to augment the dataset with a new column, "gameOutcome", which has values (win/loss/draw), by looking up the outcome of each game (each row) and adding that to your dataset.
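As a minimal sketch of that augmentation step, assuming the outcomes are available as a lookup keyed by "gameReferenceId": the `OUTCOMES` table and its win/loss values below are made up for illustration, and `add_outcome` is a hypothetical helper, not part of any library. Only a few columns from the question's sample are kept for brevity.

```python
import csv
import io

# Two rows from the question's sample data, with only a few columns kept.
RAW_CSV = """id,totalGreen,totalBlue,totalRed,gameReferenceId
1,1,3,2,1963886
2,1,4,1,1963887
"""

# Hypothetical lookup: gameReferenceId -> outcome. These outcomes are invented;
# in practice they would come from wherever the real game results are stored.
OUTCOMES = {"1963886": "win", "1963887": "loss"}


def add_outcome(rows, outcomes, default="unknown"):
    """Return copies of the rows with a 'gameOutcome' key added.

    Rows whose gameReferenceId is missing from the lookup get `default`.
    """
    return [
        {**row, "gameOutcome": outcomes.get(row["gameReferenceId"], default)}
        for row in rows
    ]


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
labelled = add_outcome(rows, OUTCOMES)
for row in labelled:
    print(row["gameReferenceId"], row["gameOutcome"])
```

Rows that end up with the "unknown" default would probably need to be dropped before training, since they carry no label.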

Once you have this, you have the 12 independent variables (the 12 columns already there), and the 1 dependent variable (the "gameOutcome"), and the prediction task should be straightforward from there.

"Title: Given a label, how do I predict features?"

(note: this section will not help your program)

What you are looking for are generative models. A generative model can generate instances given the label and/or some random seed. This is a completely different kind of model from a discriminative model, which predicts the label given the instance (this is the kind you have).

One of the simplest generative models is Naive Bayes. While Naive Bayes is normally used for classification (predicting a label), it learns the distribution of each feature for each label, so it has enough information to generate instances given a label as well. Here are some tips on how to turn Naive Bayes into a generative model.
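To make the idea concrete, here is a rough sketch of sampling features from a Naive Bayes style model. It assumes each feature is Gaussian within a label (my assumption, not something the question's data guarantees), and the toy data, `fit_gaussian_nb`, and `sample_features` are all hypothetical names invented for this illustration.

```python
import random
import statistics
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate, for every label, the (mean, std) of each feature."""
    by_label = defaultdict(list)
    for features, label in zip(X, y):
        by_label[label].append(features)
    params = {}
    for label, rows in by_label.items():
        columns = list(zip(*rows))  # transpose: one tuple per feature
        params[label] = [
            (statistics.mean(col), statistics.pstdev(col)) for col in columns
        ]
    return params


def sample_features(params, label, rng):
    """Draw one synthetic feature vector for `label`.

    Each feature is sampled independently, which is exactly the
    "naive" conditional-independence assumption of Naive Bayes.
    """
    return [rng.gauss(mu, sigma) for mu, sigma in params[label]]


# Made-up toy data: two numeric features per game and an invented outcome.
X = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
     [8.0, 1.0], [9.0, 2.0], [10.0, 3.0]]
y = ["win", "win", "win", "loss", "loss", "loss"]

params = fit_gaussian_nb(X, y)
rng = random.Random(0)  # fixed seed so the run is repeatable
print(sample_features(params, "win", rng))
```

Because the features are sampled independently, any correlation between columns is lost in the generated instances; that is the price of the Naive Bayes assumption.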

If you are looking for generative deep neural network models, this blog post by OpenAI has a nice overview of three modern approaches (generative adversarial networks, variational autoencoders, and autoregressive models). It explains at a high level the strengths and weaknesses of each approach, and links to some open source projects that you can try out as well.

user3667125
  • Thank you so much for explaining in depth. – Tetranyble Dec 20 '20 at 12:50
  • When a post is unclear or there are inconsistencies between the title's question and the body's question, the best thing to do is to ask for clarification under the post/question before providing an answer (and, meanwhile, flag the post to be closed as unclear to avoid getting all kinds of answers that differ because of different interpretations of the post/question). – nbro Dec 20 '20 at 12:53
  • Got it, makes sense! – user3667125 Dec 20 '20 at 20:50