Copyright © Cay S. Horstmann 2015
This work is licensed under a Creative Commons Attribution 4.0 International License
val head = <head><title>Fred's Memoirs</title></head>Has type
scala.xml.Elem
val items = <li>Fred</li><li>Wilma</li>Is a
scala.xml.NodeSeq
x <y
xml"..."
instead.scala.xml.Node
superclass of all XML node typesElem
and Text
e
is an Elem
, e.label
is the tag namee.child
is the NodeSeq
of children (it's child, not children, like in XPath)NodeSeq
is a subtype of Seq[Node]
with support for XPath-like opsNode
extends NodeSeq
as a sequence of length 1NodeBuffer
val items = new scala.xml.NodeBuffer items += <li>Fred</li> items += <li>Wilma</li>
NodeBuffer
is converted to NodeSeq
, don't mutate it any more!image.attributes
gets a map-like key/value association
val image = <img src="logo.jpg" alt="San José State University Logo"/> val src = image.attributes("src")
NodeSeq
because it may contain EntityRef
image.attributes.toAttrMap
, which yields a Map[String, String]
<ul><li>{items(0)}</li><li>{items(1)}</li></ul>
NodeSeq
are spliced into the XML. Everything else wrapped into Atom[T]
String
becomes Atom[String]
, not Text
<ul>{items.map(i => <li>{i}</li>)}</ul>
{ }
inside XML, double them:
<h1>The Natural Numbers {{1, 2, 3, ...}}</h1>
<img src={makeURL(fileName)}/>
<img src="{makeURL(fileName)}"/>does not execute the expression in
{ ... }
<table> {(1 to 10) map (i => <tr> {(1 to 10) map (j => <td>{ i * j }</td>)} </tr>)} </table>
NodeSeq
whose child
contains 100 elementsElem
whose child
is a NodeSeq
of length 10Elem
whose child
is a NodeSeq
of length 100Elem
whose descendants include 100 Text
nodes\
and \\
with NodeSeq
like /
and //
in XPath
val list = <dl><dt>Java</dt><dd>Gosling</dd>
<dt>Scala</dt><dd>Odersky</dd></dl>
val languages = list \ "dt"
// Yields a NodeSeq of <dt>Java</dt>
and <dt>Scala</dt>
doc \ "body" \ "_" \ "li"
\\
locates descendants at any depth:
doc \\ "img"
@
locates attributes:
img \ "@alt"
(<img src="hamster.jpg"/><img src="frog.jpg"/> \\ "@src").text
"hamster.jpgfrog.jpg"
<src>hamster.jpg</src><src>frog.jpg</src>
NodeSeq("hamster.jpg", "frog.jpg")
n match { case <img/> => ... }
img
with any attributes and no childrencase <li>{_}</li> => ...
case <li>{_*}</li> => ...
case <li>{child}</li> => child.text case <li>{Text(item)}</li> => item case <li>{children @ _*}</li> => children.map(...)
children
is a Seq[Node]
and not a NodeSeq
ul
to ol
:
import scala.xml._ import scala.xml.transform._ val rule1 = new RewriteRule { override def transform(n: Node) = n match { case e @ <ul>{_*}</ul> => e.asInstanceOf[Elem].copy(label = "ol") case _ => n } }
RuleTransformer
:
val transformed = new RuleTransformer(rule1, rule2, rule3).transform(root);
copy
makes a copy of an Elem
, modifying the tag or children:
list.copy(child = list.child ++ <li>Another item</li>)
%
to change an attribute:
image % Attribute(null, "alt", "An image of a hamster", Null)
val root = scala.xml.XML.loadFile("myfile.xml")
ConstructingParser
preserves comments, CDATA sections, and optionally white space:
import scala.xml.parsing._ val parser = ConstructingParser.fromFile(new File("myfile.xml"), preserveWS = true) val root = parser.document.docElem
val parser = new XhtmlParser(scala.io.Source.fromFile("myfile.html")) val doc = parser.initialize.document
XML.save("myfile.xml", root) XML.write(writer, root)Each has different options—check the scaladoc
unit11
. Right-click on the project in the tree display at the left, select Properties → Java Build Path → Libraries → Add External JARs, then locate to the directory where the Scala IDE is installed, go to its plugins
directory, and pick org.scala-lang.modules.scala-xml_1.0.2.jar
.images.sc
. Now read a web page:
import scala.xml._ import scala.xml.parsing._ val source = scala.io.Source.fromURL("http://horstmann.com/index.html") val parser = new XhtmlParser(source) val doc = parser.initialize.documentWhat do you get?
\\
, find all img
tags. How many are they?src
attributes. (Hint: \\
returns a NodeSeq
.)null
come from? (If you can't find it, don't worry—there will be a homework assignment to help.)What did the loop of the preceding problem do? Print all characters in the string s *Print the string s in reverse order Print every other character in the string s Count the number of characters in sYou'd like them to look like this:
class
and onclick
attributes need to be put where?
script
and link
tags at the beginning of these slides to see how the JS/CSS are included.quiz.txt
into some known location, such as ~/Downloads
with the sample question above.unit11
project, make a worksheet quizwriter.sc
:
import scala.xml._ val source = scala.io.Source.fromFile("~/Downloads/quiz.txt", "UTF-8") val lines = source.getLines.toVector val i = lines.lastIndexWhere(_.trim.length == 0) val questionText = lines.take(i).map(l => <p>{l}</p>) val doc = <html>...</html>
XML.save("~/Downloads/quiz.html", doc)Look at the file in the browser. Does it work?
<body> <h1>Main heading</h1> Introduction <h2>Subheading</h2> Contents <h2>Another Subheading</h2> More Contents </body>But the contents of a subsection was not nested in the contents of a section, unless you chose to include it in a
div
. HTML 5 suggests that you use the section
element to mark the semantics:
<body> <h1>Main heading</h1> Introduction <section> <h2>Subheading</h2> Contents </section> <section> <h2>Another Subheading</h2> More Contents </section> </body>We'd like to achieve that transformation. Why is it harder than the example of replacing all
ul
with ol
?
child
node sequence of body
. We could loop over it, but that wouldn't be the functional way. Instead, we'd like to group it into a sequence of sequences:
h2
h2
up to, but not including, the secondh2
to the endsection
.
Seq
. Which ones produce a sequence, iterator, or map of sequences?
def splitWhere[T](xs: Seq[T])(p: (T) => Boolean): Seq[Seq[T]]so we can call
val splits = splitWhere(body.child)(...) val intro = splits(0) val section1 = <section>{splits(1)}</section>What do you need for the ...?
splitWhere
for testing. Don't code like that! You'll get to produce a nicer version in the homework.
def splitWhere[T](xs: Seq[T])(p: (T) => Boolean): Seq[Seq[T]] = { // Don't code like this! val result = new scala.collection.mutable.ArrayBuffer[scala.collection.mutable.ArrayBuffer[T]]() result += new scala.collection.mutable.ArrayBuffer[T]() for (x <- xs) { if (p(x)) result += new scala.collection.mutable.ArrayBuffer[T]() result.last += x } result }You'll also need to figure out how to get to the body. Remember that
doc \\ "body"
is a NodeSeq
.
val sections = splits.tail.map(s => ...) val newBody = ...