I agree with Tomasz that the approach you are describing falls within the field of multi-objective reinforcement learning (MORL). For a solid introduction to MORL I would recommend the survey by Roijers, D. M., Vamplew, P., Whiteson, S., & Dazeley, R. (2013). A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research, 48, 67-113.
https://www.jair.org/index.php/jair/article/view/10836 (disclaimer: I'm one of the authors, but I genuinely believe it will be useful to you).
Our survey argues for the need for multi-objective methods by describing three scenarios in which agents using single-objective RL may be unable to provide a solution that matches the needs of the user. Briefly, these are:
(a) the unknown weights scenario, where the required trade-off between the objectives isn't known in advance, so to be effective the agent must learn multiple policies corresponding to different trade-offs and then at run-time select the one which matches the current preferences (e.g. this can arise when the objectives correspond to different costs which vary in relative price over time);
(b) the decision support scenario, where scalarization of a reward vector is not viable (for example, in the case of subjective preferences which defy explicit quantification), so the agent needs to learn a set of policies and then present these to a user who will select their preferred option; and
(c) the known weights scenario, where the desired trade-off between objectives is known but its nature is such that the returns are non-additive (i.e. the user's utility function is non-linear), and therefore standard single-objective methods based on the Bellman equation can't be directly applied.
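As a rough illustration of scenario (a), here is a minimal Python sketch of the run-time selection step, assuming a linear utility. The value vectors, the weights and the `select_policy` helper are all made up for this example and are not taken from the survey; learning the policies themselves is not shown.

```python
import numpy as np

# Hypothetical expected-return vectors of three already-learned policies.
# Each row is one policy; columns are (objective 1, objective 2).
# The numbers are invented purely for illustration.
policy_values = np.array([
    [10.0, 1.0],   # policy 0: favours objective 1
    [6.0, 6.0],    # policy 1: balanced
    [1.0, 10.0],   # policy 2: favours objective 2
])


def select_policy(weights, values=policy_values):
    """Return the index of the policy with the highest linearly scalarised value."""
    weights = np.asarray(weights, dtype=float)
    scalarised = values @ weights  # one weighted sum per policy
    return int(np.argmax(scalarised))


# The preferred policy changes as the relative weights (e.g. prices) change.
choice_obj1_heavy = select_policy([0.9, 0.1])
choice_balanced = select_policy([0.5, 0.5])
choice_obj2_heavy = select_policy([0.1, 0.9])
print(choice_obj1_heavy, choice_balanced, choice_obj2_heavy)
```

The point of the sketch is only that the weights enter after learning: the same set of policies serves every weight vector, which is why a single policy learned for one fixed trade-off is not enough in this scenario.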
We propose a taxonomy of MORL problems in terms of the number of policies they require (single or multi-policy), the form of utility/scalarization function supported (linear or non-linear), and whether deterministic or stochastic policies are allowed, and relate this to the nature of the set of solutions which the MO algorithm needs to output. This taxonomy is then used to categorise existing MO planning and MORL methods.
One final important contribution is identifying the distinction between maximising Expected Scalarised Return (ESR) and maximising Scalarised Expected Return (SER). ESR is appropriate in cases where we are concerned about the results within each individual episode (for example, when treating a patient, who will only care about their own individual experience), while SER is appropriate if we care about the average return over multiple episodes. This has turned out to be a much more important issue than I anticipated at the time of the survey, and Diederik Roijers and his colleagues have examined it more closely since then (e.g. http://roijers.info/pub/esr_paper.pdf).
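To make the ESR/SER difference concrete, here is a small Python sketch. It assumes a hypothetical non-linear utility (the product of two objectives) and a made-up policy whose episodes return either (10, 0) or (0, 10) with equal probability; neither comes from the survey.

```python
import numpy as np


def utility(returns):
    """Hypothetical non-linear utility: the product of the two objectives."""
    returns = np.asarray(returns, dtype=float)
    return returns[..., 0] * returns[..., 1]


# Invented example: two possible per-episode return vectors and their probabilities.
episode_returns = np.array([[10.0, 0.0], [0.0, 10.0]])
probabilities = np.array([0.5, 0.5])

# SER: take the expectation of the return vector first, then apply the utility.
expected_return = probabilities @ episode_returns
ser = float(utility(expected_return))

# ESR: apply the utility to each episode's return, then take the expectation.
esr = float(probabilities @ utility(episode_returns))

print("SER:", ser)
print("ESR:", esr)
```

Under these assumptions the policy looks good on average (SER is 25, because the expected return vector is (5, 5)), yet every individual episode has zero utility (ESR is 0). With a linear utility the two quantities coincide, so the distinction only matters in the non-linear case.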