2

I want to use LLMs to predict edge weights in a graph based on attributes between two nodes. Is this even possible? If not, what would you recommend?

I tried to look up uses of LLM in regression tasks, but haven't had much luck finding anything helpful.

nbro
  • 39,006
  • 12
  • 98
  • 176
  • A better question might be, why? are other well-understood approaches not sufficient? Large _language_ models don't seem like a good place to start, unless somehow the weights will depend on understanding text. – Sean Owen May 27 '23 at 16:29

1 Answers1

3

Regression with LLMs is definitely possible. Assuming you use a GPT-like model, you can either

  1. train the transformer from scratch on the regression task, or
  2. first pre-train the transformer on a general task, and then transfer learn the regression task by replacing the final linear layer.

Which option is more appropriate depends on the kind of regression task and how well that task is generalizable. For example, if you want to assess how positive a text is, you can better do that using option 2. However, a super-specific regression task is likely easier to do through option 1.

Check out this post for more general knowledge on transformers, and a bit of context about 'pretraining' and 'transfer learning'.

Robin van Hoorn
  • 1,810
  • 7
  • 32
  • Thanks! Does the answer change if the data is mostly numerical and categorical data where not much text is involved? – sharkeater123 May 18 '23 at 03:02
  • You can still use transformers for numerical data. Neural nets have never been veey good at categorical data, but it is possible. The approach is still the same. If its generalizable then use 1 otherwise 2 – Robin van Hoorn May 18 '23 at 06:00
  • Would you recommend LLMs for tabular data in general? I feel like there are easier ways to accomplish the task at hand. The post was helpful, are there any other resources you can share? Thanks! – sharkeater123 May 18 '23 at 14:05
  • Transformers are very general architectures. Although they were introduced for NLP they can be used for static data as well (tho arguably you wouldnt need positional encoding then). However, if you have static data, simple MLP architectures can get you started faster probably. For graphs, which you mention, you might actually wanna look into graph-neural-networks. They are super interesting and might be more suited for your problem. – Robin van Hoorn May 18 '23 at 14:55
  • Awesome, thanks for answering. I was told to use GPT-4 to work on predicting edge weights, but I can't imagine it being super helpful which is why I asked. To be clear, applying GPT to static data prediction would take a lot of setup right? – sharkeater123 May 18 '23 at 16:00
  • It is possible, but for graphs its simply not the right architecture – Robin van Hoorn May 19 '23 at 08:10
  • The overall graph structure doesn't mean much. It's more so about the attributes associated with two nodes and the edge. It's essentially just predicting a price based on numerical and categorical features. GPT is not the right architecture for this? – sharkeater123 May 19 '23 at 14:50
  • Theres many architectures that can be used for such a task. If you'd like specific advice on how to approach a problem, i'd recommend starting a new question, detailing your dataset/use-case/target and asking what kind of architecture would be most suitable. This question was opened for whether transformers can be used for regression tasks, not for advice on a specific problem (although that was the intention behind it). – Robin van Hoorn May 19 '23 at 15:27