### Definition

In premDAT, the similarity is a measure of how close two characters are. It ranges from 1 to 100 and currently relies on two attributes of the data:

- **The dimensions**
- **The tags**

Each pair of characters is compared dimension by dimension. Similar scores (or levels) in a dimension bring more points, but the **importance** of that dimension varies. When a dimension reaches 1 or 5 for at least one character, it is considered important, a defining attribute of that character. If both scores match on a defining attribute, the similarity level will likely be high. Conversely, large differences on defining attributes will severely decrease similarity. Matching tags also add a few points to the final computation, while tags present for one character but not the other slightly diminish the final similarity score.
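As a minimal sketch of the two rules described above, the following JavaScript flags a dimension as defining and splits tags into shared and unshared groups. The function names `isDefining` and `compareTags` are hypothetical and are not taken from premDAT's code; only the rules themselves come from the description.

```javascript
// Hypothetical helpers illustrating the rules above (not premDAT's actual API).

// A dimension is "defining" for a pair when at least one score is 1 or 5.
function isDefining(scoreA, scoreB) {
  return [scoreA, scoreB].some((s) => s === 1 || s === 5);
}

// Split tags into those shared by both characters and those held by only one.
function compareTags(tagsA, tagsB) {
  const a = new Set(tagsA);
  const b = new Set(tagsB);
  const shared = [...a].filter((t) => b.has(t));
  const unshared = [...a, ...b].filter((t) => !(a.has(t) && b.has(t)));
  return { shared, unshared };
}
```

Shared tags raise the similarity, while unshared tags only raise the total it is divided by, which is why they pull the final percentage down.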

### Concrete example (and values)

Consider the following abstract example with two characters (A and B), three tags (I, II, III) and five dimensions.

A has the following dimensions:

- Dimension 1 : 4
- Dimension 2 : 2
- Dimension 3 : 3
- Dimension 4 : 5
- Dimension 5 : 1
- Tags: I

B has the following dimensions:

- Dimension 1 : 1
- Dimension 2 : 2
- Dimension 3 : 5
- Dimension 4 : 4
- Dimension 5 : 3
- Tags: I, III

Here is how the similarity is computed:

- **Dimension 1**: the importance of the dimension is 10, because B has a score of 1. The similarity score for the dimension is `4 - Math.abs(Dimension 1 for A - Dimension 1 for B)`, which results in 1. Thus, the dimension scores 10 points of similarity out of a possible 40 (`10/40`).
- **Dimension 2**: importance of 5 (neither score is 1 or 5), similarity of 4. Score is `20/20`.
- **Dimension 3**: importance of 10, similarity of 2. Score is `20/40`.
- **Dimension 4**: importance of 10, similarity of 3. Score is `30/40`.
- **Dimension 5**: importance of 10, similarity of 2. Score is `20/40`.
- **Tags**: each tag adds 3 points to the total, and each tag shared by both characters adds 3 points to the similarity score. A and B both have tag I, but only B has tag III. Thus, the total increases by 6 points while the similarity score increases by 3 (`3/6`).

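The final result is computed by `similarity / total * 100`. Summing the scores listed above gives a similarity of `10 + 20 + 20 + 30 + 20 + 3 = 103` out of a total of `40 + 20 + 40 + 40 + 40 + 6 = 186`. Thus, the similarity between A and B is `103 / 186 * 100 ~= 55%`.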
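The whole computation can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions, not premDAT's actual implementation: the weights (10 for a defining dimension, 5 otherwise, 3 per tag) and the 1-to-5 score range are inferred from the example, and `computeSimilarity` and the object shape are hypothetical.

```javascript
// Assumed weights, inferred from the worked example rather than from premDAT's source.
const DEFINING_IMPORTANCE = 10; // at least one score is 1 or 5
const DEFAULT_IMPORTANCE = 5; // every other dimension
const MAX_DIMENSION_SIMILARITY = 4; // assumes scores range from 1 to 5
const TAG_POINTS = 3;

// Hypothetical function: returns the raw points, the maximum, and the percentage.
function computeSimilarity(a, b) {
  let similarity = 0;
  let total = 0;

  a.dimensions.forEach((scoreA, i) => {
    const scoreB = b.dimensions[i];
    const defining = [scoreA, scoreB].some((s) => s === 1 || s === 5);
    const importance = defining ? DEFINING_IMPORTANCE : DEFAULT_IMPORTANCE;
    similarity += importance * (MAX_DIMENSION_SIMILARITY - Math.abs(scoreA - scoreB));
    total += importance * MAX_DIMENSION_SIMILARITY;
  });

  // Every distinct tag raises the total; only tags held by both raise the similarity.
  const tagsA = new Set(a.tags);
  const tagsB = new Set(b.tags);
  for (const tag of new Set([...tagsA, ...tagsB])) {
    total += TAG_POINTS;
    if (tagsA.has(tag) && tagsB.has(tag)) similarity += TAG_POINTS;
  }

  return { similarity, total, percent: (similarity / total) * 100 };
}

const A = { dimensions: [4, 2, 3, 5, 1], tags: ["I"] };
const B = { dimensions: [1, 2, 5, 4, 3], tags: ["I", "III"] };
const result = computeSimilarity(A, B);
console.log(result.similarity, result.total, Math.round(result.percent));
// prints: 103 186 55
```

Running it on characters A and B reproduces the per-dimension and tag scores listed in the example: 103 points out of 186.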