In general, I must note that the style of intro probability texts has changed a lot over the years. These days there is much less emphasis on combinatorics, not as much formal mathematics (no remainder terms in Taylor expansions, if Taylor's theorem is mentioned at all), and more regard for the idea that an algorithm can be better than a messy closed-form solution.
There are many books of the form "Probability and Statistics ...", in which the first part is devoted to probability. A book like Rice's "Mathematical Statistics and Data Analysis" comes to mind. Typically the treatment of probability is rather brisk, and maybe not quite what students can handle the first time around. We often use Rice for Stat 242.
A book like Olkin, Gleser, and Derman's "Probability Models and Applications" covers material similar to 241, but its treatment of conditioning is too plodding. The conditional probability P(A|B) first appears in their Section 3.3, introduced as a ratio; in 241, conditional probabilities appear in Lecture 2, treated as a basic concept. They introduce conditional distributions and conditional expectations in Section 9.3; 241 introduced conditional expectations in Lecture 4. Their Chapter 12 contains the formal (classical) theory of Markov chains in discrete time; 241 treated a Markov chain problem in Lecture 3.
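(For reference, the ratio definition in question is the standard one: P(A|B) = P(A ∩ B)/P(B), defined whenever P(B) > 0. The contrast with 241 is one of emphasis: there, conditional probability is taken as a basic modelling concept from the start, with the ratio emerging as a consequence rather than serving as the definition.)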
I once used Chung's "Elementary Probability Theory with Stochastic Processes" for 241, but I don't think the students liked it. The book had some elegance.
The Pitman text got mixed reviews when I used it: some students loved it; others found it too mathematical. I have adapted some topics from the book.
I feel that I must provide very detailed notes precisely because my approach to probability differs from that of most texts. Ultimately we all end up doing the same mathematical calculations, but the connection with reality differs in important ways. I learnt about the virtues of emphasizing conditioning from some old notes written by Terry Speed.