Many Elements Come Together to Cut DNA
The point of interest in a CRISPR event is the Cas9-sgRNA-DNA complex. Its active component is a ribonucleoprotein made up of the nuclease enzyme Cas9 bound to a guide RNA molecule (gRNA). The guide RNA has a specific sequence that directs the Cas9 nuclease to a corresponding DNA target in the genome. However, the Cas9-gRNA complex can only bind to a point on the genome that has a protospacer-adjacent motif (PAM) site, which is simply a specific short sequence, such as "NGG" (any nucleotide followed by two guanines).
When the Cas9-gRNA complex finds a PAM site on one of the strands of DNA, known as the non-target strand, the enzyme unwinds the double-helix structure of the genomic DNA. Following this, the Cas9 enzyme attempts to pair the gRNA with the other strand, known as the target strand. If there is a match between the gRNA and the genomic target strand, the enzyme activates two separate nuclease domains, which cut both the non-target and the target strand, resulting in a double strand break (DSB).
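The search described above can be sketched in a few lines of Python. This is a minimal illustration, not a guide-design tool: the function names, the example sequences, and the use of exact string matching are all assumptions made for the sketch. The guide is written as DNA letters and compared with the strand that carries the PAM (the non-target strand), since that strand has the same sequence as the guide.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_target_sites(genome: str, guide: str) -> list[tuple[int, str]]:
    """Find places where `guide` is immediately followed by an NGG PAM.

    Both strands are scanned. Each hit is (position, strand), where
    position is the 0-based start of the match on the scanned strand.
    Exact matching is a simplification made for this sketch.
    """
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        start = seq.find(guide)
        while start != -1:
            pam = seq[start + len(guide):start + len(guide) + 3]
            # "NGG": any base, followed by two guanines.
            if len(pam) == 3 and pam.endswith("GG"):
                hits.append((start, strand))
            start = seq.find(guide, start + 1)
    return hits


# Hypothetical example sequences, invented for illustration only.
guide = "ATGCCGTAAGCTTGACCTGA"
genome = "CC" + guide + "TGG" + "AAAA"
print(find_target_sites(genome, guide))  # [(2, '+')]
```

If the three bases after the match are changed to something other than NGG (for example "TAA"), the function returns an empty list, mirroring the point that a matching sequence without a PAM is not bound.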
Broken DNA is Repaired
When the bacterial immune system targets invading viral or foreign DNA, the double strand break to the pathogenic DNA usually protects the bacteria from any harm - once broken, the invading DNA is rendered useless. When the double strand break is introduced to a segment of a cell's DNA in genome editing, that cell quickly recognises that a segment of its genome has been cut, and will attempt to initiate a repair.
Organisms vary in their preferred repair pathways, but most are able to use the Non-Homologous End Joining (NHEJ) pathway, which rejoins the broken ends by effectively "making up" a repair. This repair is error-prone: it frequently introduces small insertions or deletions at the break site, and will typically disrupt the correct function of the gene that was cut.
Alternatively, where a "donor" strand of DNA that is sufficiently similar to the original section of DNA is available, the donor can be integrated via the Homology-Directed Repair (HDR) pathway. This is not a reliable event, however, and it is very rare in CRISPR events outside of genome editing.
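To see why an error-prone NHEJ repair so often disrupts a gene, it can help to look at what a small deletion does to the three-base reading frame. The sketch below is purely illustrative: the coding sequence, the break position, and the single-base deletion are invented for the example, and real NHEJ outcomes vary from cell to cell.

```python
def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete three-base codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(seq: str, cut_index: int, n_deleted: int) -> str:
    """Model a repair that loses `n_deleted` bases starting at the break."""
    return seq[:cut_index] + seq[cut_index + n_deleted:]


def is_frameshift(before: str, after: str) -> bool:
    """A length change that is not a multiple of three shifts the frame."""
    return (len(before) - len(after)) % 3 != 0


# Hypothetical coding sequence, invented for illustration only.
original = "ATGGCTGAAACCTTGGGA"
repaired = delete_bases(original, cut_index=6, n_deleted=1)

print(codons(original))  # ['ATG', 'GCT', 'GAA', 'ACC', 'TTG', 'GGA']
print(codons(repaired))  # ['ATG', 'GCT', 'AAA', 'CCT', 'TGG']
print(is_frameshift(original, repaired))  # True
```

Every codon downstream of the break changes after the one-base loss, which is why even a very small repair error can knock out the function of the gene.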
The different CRISPR repair mechanisms and their use in genome editing are covered here:
How CRISPR Immunity Systems Differ from CRISPR Genome Editing Systems
There are three primary types of CRISPR system found in bacteria and archaea: type I, type II and type III. Type II CRISPR systems are the best understood, and the term "CRISPR" typically refers to these systems and their variants.
In the section above, we referred to "guide RNAs" conveying targeting specificity to Cas9 enzymes, guiding the nuclease to a specific target so that it can introduce a double strand break there. The term gRNA is very much a generalisation, and each CRISPR type has peculiarities relating to the structure and mechanics of the gRNA and to how it is generated.
For instance, in wild-type systems, crRNAs provide the targeting, whereas in genome editing CRISPR systems, a synthetic single guide RNA (sgRNA) serves this role.
Wild-type Type II CRISPR Systems
In wild-type type II CRISPR systems, the guide RNA is referred to as a CRISPR RNA, or crRNA. It is transcribed from the DNA sequences known as protospacers: ~20 base pair long sections of foreign DNA, separated by short palindromic repeats, in the bacterial genome. To create the crRNA, the entire protospacer array is transcribed as a single pre-crRNA, which is then processed into individual crRNAs with the help of a trans-activating crRNA (tracrRNA) that has a sequence complementary to the palindromic repeat. When the tracrRNA hybridizes to the short palindromic repeat, it triggers processing by the bacterial double-stranded RNA-specific ribonuclease, RNase III, which cuts the pre-crRNA into crRNAs. Each crRNA remains bound to a tracrRNA, which in turn binds to the Cas9 nuclease. The nuclease then becomes activated and specific to the DNA sequence complementary to the crRNA.
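The layout of the array (targeting sequences separated by copies of the same repeat) can be mimicked with a short Python sketch. This is a deliberately simplified model: the repeat and array sequences are invented, the function name is hypothetical, and a plain string split stands in for the tracrRNA and RNase III processing, which in a real cell acts on RNA and leaves part of the repeat attached to each crRNA.

```python
def extract_targeting_sequences(array: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`.

    Anything before the first repeat or after the last repeat is treated
    as flanking sequence and ignored.
    """
    pieces = array.split(repeat)
    return [piece for piece in pieces[1:-1] if piece]


# Hypothetical, shortened sequences used only for illustration.
repeat = "GTTTTAGA"
array = "AA" + repeat + "ACGTACGT" + repeat + "TTGGCCAA" + repeat + "CC"

for number, unit in enumerate(extract_targeting_sequences(array, repeat), 1):
    print(number, unit)
```

Each printed unit corresponds to the targeting portion of one crRNA cut from the pre-crRNA transcript.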
Modified Type II CRISPR Systems for Genome Editing
The main difference between genome editing CRISPR systems and wild-type systems is that the sgRNA is a single construct; there is no need to have both a crRNA and a tracrRNA. This is because the protospacer and the scaffold are transcribed as a single unit, resulting in a single sgRNA molecule that has the functionality of both the tracrRNA (it binds to the Cas9) and the protospacer (it targets a specific region of DNA).
This makes genome editing CRISPR systems a little easier to use, as fewer steps are required in the cell to build the functional ribonucleoprotein. Further, unlike wild-type systems, where the bacterium expresses the CRISPR system from its own genome, here the system is expressed from a vector that is inserted into the cell by a scientist (see above).
Simply put, the DNA encoding the Cas9 is transcribed and translated by the cell, and the resulting protein forms a complex with the guides that are expressed from the protospacers encoded on the plasmid/vector.
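The idea that the protospacer and the scaffold are transcribed as one unit can be sketched as a simple concatenation followed by transcription (T becomes U). The sketch assumes the targeting sequence sits in front of the scaffold, and the scaffold string used here is a made-up placeholder, not a real Cas9 scaffold sequence; the function name is likewise hypothetical.

```python
def build_sgrna(targeting_dna: str, scaffold_dna: str) -> str:
    """Join targeting sequence and scaffold, then transcribe DNA to RNA.

    Assumes the targeting sequence precedes the scaffold on the vector.
    """
    template = (targeting_dna + scaffold_dna).upper()
    invalid = set(template) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA: {sorted(invalid)}")
    # Transcription is modelled simply as replacing T with U.
    return template.replace("T", "U")


# Placeholder values for illustration; the scaffold is NOT a real sequence.
targeting = "ATGCCGTAAGCTTGACCTGA"
placeholder_scaffold = "ACACACACAC"
print(build_sgrna(targeting, placeholder_scaffold))
```

The result is one RNA string carrying both roles: the front part targets a region of DNA, and the scaffold part is what would bind the Cas9.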
Generally, genome editing CRISPR applications involve a single Cas9 enzyme from Streptococcus pyogenes, known as SpCas9 (with an "NGG" PAM recognition sequence), along with a single synthetic guide RNA (sgRNA). However, different Cas9 enzymes can be used, each with its own PAM sequence, and many CRISPR vectors can in fact carry more than one protospacer sequence, meaning that multiple genes can be targeted by a single vector that expresses a Cas9 and multiple sgRNAs.
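Because each Cas9 enzyme has its own PAM, it can be useful to treat the PAM as a pattern written in degenerate nucleotide codes rather than hard-coding "NGG". The sketch below assumes a small subset of the IUPAC codes (N for any base, R for A or G, Y for C or T). "NGG" comes from the text above, while "NNGRRT" is included only as an illustrative longer pattern of the kind reported for Staphylococcus aureus Cas9. The function names and test sequences are hypothetical.

```python
# Subset of IUPAC nucleotide codes, enough for the patterns used below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def matches_pam(site: str, pam_pattern: str) -> bool:
    """Check whether `site` fits a PAM pattern written in IUPAC codes."""
    if len(site) != len(pam_pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pam_pattern))


def pam_positions(seq: str, pam_pattern: str) -> list[int]:
    """Return every 0-based position where the PAM pattern occurs in `seq`."""
    width = len(pam_pattern)
    return [
        i for i in range(len(seq) - width + 1)
        if matches_pam(seq[i:i + width], pam_pattern)
    ]


print(pam_positions("ACGGTAGGGT", "NGG"))  # [1, 5, 6]
print(matches_pam("ACGAGT", "NNGRRT"))     # True
```

Swapping the pattern is then all that is needed to ask where a different enzyme could bind on the same stretch of DNA.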
More details on designing an sgRNA to target a specific region of DNA can be found here.