This is a difficult question, and very few students ever solve it
completely without help. It is easiest to work with the difference

w = u-v

so that w vanishes on the boundary and w also satisfies the diffusion
equation.

I will use S instead of phi, so

S(t) = int_D w(x,y,t)^2 dx dy

where D is the square [0,1] x [0,1]. You should be able to show that

S'(t) = - 2 int_D ||grad w||^2 dx dy

which is the first hint. Thus S(t) is non-negative and its derivative
is non-positive, so S is a non-negative, non-increasing function of t.
This implies that lim S(t) exists as t tends to infinity and is
non-negative, but it does not show that the limit is zero, which needs
more work.
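
If you want to convince yourself of the formula for S'(t) before proving it, a numerical check can help. The following is only a minimal sketch: it assumes the diffusion equation is w_t = w_xx + w_yy (unit diffusivity), and the particular w below is an example I have made up, not anything taken from the question. All function names are hypothetical.

```python
import numpy as np

# Assumed example (not from the question): an exact solution of
# w_t = w_xx + w_yy on the unit square that vanishes on the boundary.
def w(x, y, t):
    a = np.exp(-2 * np.pi**2 * t)
    b = 0.5 * np.exp(-5 * np.pi**2 * t)
    return (a * np.sin(np.pi * x) + b * np.sin(2 * np.pi * x)) * np.sin(np.pi * y)

def grad_w_squared(x, y, t):
    # ||grad w||^2 computed from the analytic partial derivatives of w.
    a = np.exp(-2 * np.pi**2 * t)
    b = 0.5 * np.exp(-5 * np.pi**2 * t)
    wx = np.pi * (a * np.cos(np.pi * x) + 2 * b * np.cos(2 * np.pi * x)) * np.sin(np.pi * y)
    wy = np.pi * (a * np.sin(np.pi * x) + b * np.sin(2 * np.pi * x)) * np.cos(np.pi * y)
    return wx**2 + wy**2

def integrate_over_square(f, t, n=200):
    # Midpoint rule on an n-by-n grid covering D = [0,1] x [0,1].
    s = (np.arange(n) + 0.5) / n
    x, y = np.meshgrid(s, s, indexing="ij")
    return np.mean(f(x, y, t))

def S(t):
    return integrate_over_square(lambda x, y, t: w(x, y, t)**2, t)

def S_prime_numeric(t, h=1e-5):
    # Central finite difference in t.
    return (S(t + h) - S(t - h)) / (2 * h)

def S_prime_formula(t):
    # Right-hand side of S'(t) = -2 int_D ||grad w||^2 dx dy.
    return -2 * integrate_over_square(grad_w_squared, t)

for t in [0.01, 0.02, 0.05]:
    print(t, S(t), S_prime_numeric(t), S_prime_formula(t))
```

The last two printed columns should agree closely, and the S(t) column should shrink as t grows. This checks one example only and is no substitute for the integration-by-parts argument.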

There are two possible routes for the final step. One is to argue that
S'(t) tends to zero as t tends to infinity, which implies that ||grad
w(x,y,t)|| tends to zero as t tends to infinity at every point (x,y)
in the square D (making that argument fully rigorous requires measure
theory, so you're not expected to worry too much). Now we can
integrate the x-derivative w_x(x,Y,t) along the horizontal line from
(0,Y) to (X,Y), which gives w(X,Y,t) - w(0,Y,t). Since w(0,Y,t) = 0 on
the boundary, lim ||grad w(x,y,t)|| = 0 implies that lim w(X,Y,t) = 0.
Hence S(t) tends to zero.

The second possibility (and I prefer this route) is to use a Fourier
sine series, so that

w(x,y,t) = sum_j sum_k b_jk(t) sin(pi j x) sin(pi k y).

Since w satisfies the diffusion equation, we deduce that the Fourier
coefficients satisfy

b_jk(t) = b_jk(0) exp(-pi^2 (j^2 + k^2)t),

so they're tending to zero exponentially quickly as t tends to
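infinity. You can then use Parseval's theorem, which states that

S(t) = (1/4) sum_j sum_k b_jk(t)^2

(the factor 1/4 appears because int_0^1 sin(pi j x)^2 dx = 1/2). Since
j^2 + k^2 >= 2 for all j, k >= 1, every term is at most
b_jk(0)^2 exp(-4 pi^2 t), so S(t) <= S(0) exp(-4 pi^2 t), which tends
to zero as t tends to infinity.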
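
The Fourier route can also be checked numerically. Here is a rough sketch, assuming unit diffusivity and a truncated series with randomly chosen initial coefficients (both are my assumptions for illustration, and the function names are hypothetical). Because int_0^1 sin(pi j x)^2 dx = 1/2, each mode contributes b_jk(t)^2 / 4 to S(t). The code compares that sum with a direct quadrature of w^2, and compares S(t) with the bound S(0) exp(-4 pi^2 t) that follows from j^2 + k^2 >= 2.

```python
import numpy as np

rng = np.random.default_rng(0)
J = K = 4
b0 = rng.standard_normal((J, K))  # made-up initial coefficients b_jk(0)
j = np.arange(1, J + 1)
k = np.arange(1, K + 1)

def coeffs(t):
    # b_jk(t) = b_jk(0) exp(-pi^2 (j^2 + k^2) t)
    rate = np.pi**2 * (j[:, None]**2 + k[None, :]**2)
    return b0 * np.exp(-rate * t)

def S_quadrature(t, n=200):
    # Midpoint-rule approximation of int_D w^2 dx dy on an n-by-n grid.
    s = (np.arange(n) + 0.5) / n
    sx = np.sin(np.pi * np.outer(j, s))  # sin(pi j x), shape (J, n)
    sy = np.sin(np.pi * np.outer(k, s))  # sin(pi k y), shape (K, n)
    w = np.einsum("jk,jx,ky->xy", coeffs(t), sx, sy)
    return np.mean(w**2)

def S_parseval(t):
    # Each mode contributes b_jk(t)^2 / 4 on the unit square.
    return 0.25 * np.sum(coeffs(t)**2)

for t in [0.0, 0.01, 0.05]:
    bound = np.exp(-4 * np.pi**2 * t) * S_parseval(0.0)
    print(t, S_quadrature(t), S_parseval(t), bound)
```

The two S columns should match to rounding error, and both should sit at or below the bound in the last column.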