01 Getting Started

The dependencies page lists all the jars that you will need to have in your classpath.

The class com.gargoylesoftware.htmlunit.WebClient is the main starting point. This simulates a web browser and will be used to execute all of the tests.

Most unit testing will be done within a framework like JUnit so all the examples here will assume that we are using that.

In the first sample, we create the web client and have it load the homepage from the HtmlUnit website. We then verify that this page has the correct title. Note that getPage() can return different types of pages based on the content type of the returned data. In this case we are expecting a content type of text/html, so we cast the result to a com.gargoylesoftware.htmlunit.html.HtmlPage.


@Test
public void homePage() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
    Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());

    final String pageAsXml = page.asXml();
    Assert.assertTrue(pageAsXml.contains("<body class=\"composite\">"));

    final String pageAsText = page.asText();
    Assert.assertTrue(pageAsText.contains("Support for the HTTP and HTTPS protocols"));

    webClient.closeAllWindows();
}


Often you will want to simulate a specific browser. This is done by passing a com.gargoylesoftware.htmlunit.BrowserVersion into the WebClient constructor. Constants have been provided for some common browsers but you can create your own specific version by instantiating a BrowserVersion.


@Test
public void homePage_Firefox() throws Exception {
    final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17);
    final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
    Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());

    webClient.closeAllWindows();
}


Specifying this BrowserVersion will change the user agent header that is sent up to the server and will change the behavior of some of the JavaScript.

Once you have a reference to an HtmlPage, you can search for a specific HtmlElement with one of the 'get' methods, or by using XPath.

Below is an example of finding a 'div' by an ID, and getting an anchor by name:


@Test
public void getElements() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://some_url");
    final HtmlDivision div = page.getHtmlElementById("some_div_id");
    final HtmlAnchor anchor = page.getAnchorByName("anchor_name");

    webClient.closeAllWindows();
}


XPath is the suggested way for more complex searches; a brief tutorial can be found on W3Schools.


@Test
public void xpath() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");

    //get list of all divs
    final List<?> divs = page.getByXPath("//div");

    //get div which has a 'name' attribute of 'John'
    final HtmlDivision div = (HtmlDivision) page.getByXPath("//div[@name='John']").get(0);

    webClient.closeAllWindows();
}
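To see what the two XPath expressions above select without a running server, the sketch below evaluates them with the JDK's built-in javax.xml.xpath API against a small inline document. It is only an illustration: it does not use HtmlUnit, and the sample markup and the class name XPathSketch are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {

    // Hypothetical sample markup; well-formed so the JDK XML parser accepts it.
    static final String SAMPLE =
        "<html><body>"
        + "<div name='John'>first</div>"
        + "<div name='Jane'>second</div>"
        + "<p>not a div</p>"
        + "</body></html>";

    // Parses the markup into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample markup", e);
        }
    }

    // Returns every node matched by the XPath expression.
    static NodeList select(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);

        // "//div" matches every div anywhere in the document.
        NodeList allDivs = select(doc, "//div");
        System.out.println("div count: " + allDivs.getLength());

        // The predicate narrows the match to divs whose name attribute is 'John'.
        NodeList johns = select(doc, "//div[@name='John']");
        if (johns.getLength() > 0) {
            System.out.println("John div text: " + johns.item(0).getTextContent());
        }
    }
}
```

Note the length check before reading the first match. In the HtmlUnit test above, calling get(0) on an empty result list would throw an IndexOutOfBoundsException, so a similar check is worth adding when a match is not guaranteed.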


The last WebClient constructor allows you to specify proxy server information in those cases where you need to connect through one.


@Test
public void homePage_proxy() throws Exception {
    final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_10, "http://myproxyserver", myProxyPort);

    //set proxy username and password
    final DefaultCredentialsProvider credentialsProvider = (DefaultCredentialsProvider) webClient.getCredentialsProvider();
    credentialsProvider.addCredentials("username", "password");

    final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
    Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());

    webClient.closeAllWindows();
}



Frequently we want to change values in a form and submit the form back to the server. The following example shows how you might do this.


@Test
public void submittingForm() throws Exception {
    final WebClient webClient = new WebClient();

    // Get the first page
    final HtmlPage page1 = webClient.getPage("http://some_url");

    // Get the form that we are dealing with and within that form,
    // find the submit button and the field that we want to change.
    final HtmlForm form = page1.getFormByName("myform");

    final HtmlSubmitInput button = form.getInputByName("submitbutton");
    final HtmlTextInput textField = form.getInputByName("userid");

    // Change the value of the text field
    textField.setValueAttribute("root");

    // Now submit the form by clicking the button and get back the second page.
    final HtmlPage page2 = button.click();

    webClient.closeAllWindows();
}


 

02 Using the Keyboard

For a given WebClient, the focus can be on at most one element at any given time. Focus doesn't have to be on any element within the WebClient.

There are several ways to move the focus from one element to another. The simplest is to call HtmlPage.setFocusedElement(HtmlElement). This method will remove focus from whatever element currently has it, if any, and will set it to the new component. Along the way, it will fire off any "onfocus" and "onblur" handlers that have been defined.

The element currently owning the focus can be determined with a call to HtmlPage.getFocusedElement().

To simulate keyboard navigation via the tab key, you can call HtmlPage.tabToNextElement() and HtmlPage.tabToPreviousElement() to cycle forward or backwards through the defined tab order. This tab order is defined by the tabindex attribute on the various elements as defined by the HTML specification. You can query the defined tab order with the method HtmlPage.getTabbableElements() which will return a list of all tabbable elements in defined tab order.
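The ordering rule can be sketched in plain Java. This is a minimal illustration that assumes the HTML 4 rules (positive tabindex values first, in ascending order, then elements with tabindex 0 in document order). The Field class and tabOrder method are hypothetical stand-ins, not HtmlUnit API, and the sketch ignores disabled elements and negative values, so HtmlUnit's actual behavior may differ in details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TabOrderSketch {

    // Hypothetical stand-in for a tabbable element: an id plus its tabindex.
    // A tabindex of 0 means "no explicit position".
    static final class Field {
        final String id;
        final int tabIndex;

        Field(String id, int tabIndex) {
            this.id = id;
            this.tabIndex = tabIndex;
        }
    }

    // Returns the ids in tab order, given the fields in document order.
    static List<String> tabOrder(List<Field> documentOrder) {
        List<Field> sorted = new ArrayList<Field>(documentOrder);
        // Fields without a positive tabindex sort last. List.sort is stable,
        // so fields with equal keys keep their document order.
        sorted.sort(Comparator.comparingInt(
            (Field f) -> f.tabIndex > 0 ? f.tabIndex : Integer.MAX_VALUE));
        List<String> ids = new ArrayList<String>();
        for (Field f : sorted) {
            ids.add(f.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Field> fields = Arrays.asList(
            new Field("search", 0),
            new Field("password", 2),
            new Field("userid", 1),
            new Field("submit", 0));
        // prints [userid, password, search, submit]
        System.out.println(tabOrder(fields));
    }
}
```

In a real test, the list returned by HtmlPage.getTabbableElements() plays the role of the sorted result and can be compared against the order you expect.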

Access keys, often called keyboard mnemonics, can be simulated with the method HtmlPage.pressAccessKey(char).

To use special keys, you can use htmlElement.type(int) with KeyboardEvent.DOM_VK_PAGE_DOWN.

Finally, there is an assertion for testing that will verify that every tabbable element has a defined tabindex attribute. This is done with WebAssert.assertAllTabIndexAttributesSet().


 

03 Using Tables

The first set of examples will use this simple html.


<html><head><title>Table sample</title></head><body>
    <table id="table1">
        <tr>
            <th>Number</th>
            <th>Description</th>
        </tr>
        <tr>
            <td>5</td>
            <td>Bicycle</td>
        </tr>
    </table>
</body></html>


This example shows how to iterate over all the rows and cells:


final HtmlTable table = page.getHtmlElementById("table1");
for (final HtmlTableRow row : table.getRows()) {
    System.out.println("Found row");
    for (final HtmlTableCell cell : row.getCells()) {
        System.out.println("   Found cell: " + cell.asText());
    }
}


The following sample shows how to access specific cells by row and column:


final WebClient webClient = new WebClient();
final HtmlPage page = webClient.getPage("http://foo.com");

final HtmlTable table = page.getHtmlElementById("table1");
System.out.println("Cell (1,2)=" + table.getCellAt(1,2));
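To make the row and column arguments concrete, here is a small plain-Java model of the simple sample table. It assumes that getCellAt uses zero-based indices and yields null when no cell exists at the requested position; check the Javadoc of your HtmlUnit version to confirm. The class CellLookupSketch and its cellAt method are hypothetical, and the sketch ignores rowspan and colspan.

```java
public class CellLookupSketch {

    // The simple sample table from above, one inner array per <tr>.
    static final String[][] SAMPLE = {
        {"Number", "Description"},
        {"5", "Bicycle"},
    };

    // Zero-based lookup; returns null when the position holds no cell.
    static String cellAt(String[][] table, int row, int column) {
        if (row < 0 || row >= table.length) {
            return null;
        }
        if (column < 0 || column >= table[row].length) {
            return null;
        }
        return table[row][column];
    }

    public static void main(String[] args) {
        // Second row, second column: prints Cell (1,1)=Bicycle
        System.out.println("Cell (1,1)=" + cellAt(SAMPLE, 1, 1));
        // The sample has only two columns: prints Cell (1,2)=null
        System.out.println("Cell (1,2)=" + cellAt(SAMPLE, 1, 2));
    }
}
```

Under these assumptions, a lookup at (1,2) only finds a cell when the table has at least three columns, so a null check on the result is a sensible precaution.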


The next examples will use a more complicated table that includes table header, footer and body sections as well as a caption:


<html><head><title>Table sample</title></head><body>
    <table id="table1">
        <caption>My complex table</caption>
        <thead>
            <tr>
                <th>Number</th>
                <th>Description</th>
            </tr>
        </thead>
        <tfoot>
            <tr>
                <td>7</td>
                <td></td>
            </tr>
        </tfoot>
        <tbody>
            <tr>
                <td>5</td>
                <td>Bicycle</td>
            </tr>
        </tbody>
        <tbody>
            <tr>
                <td>2</td>
                <td>Tricycle</td>
            </tr>
        </tbody>
    </table>
</body></html>


HtmlTableHeader, HtmlTableFooter and HtmlTableBody sections are groupings of rows. There can be at most one header and one footer, but there may be more than one body. Each of these contains rows which can be accessed via getRows().


final HtmlTableHeader header = table.getHeader();
final List<HtmlTableRow> headerRows = header.getRows();

final HtmlTableFooter footer = table.getFooter();
final List<HtmlTableRow> footerRows = footer.getRows();

for (final HtmlTableBody body : table.getBodies()) {
    final List<HtmlTableRow> rows = body.getRows();
    ...
}

Every table may optionally have a caption element which describes it.

final String caption = table.getCaptionText();


 

04 Using Frames (frame / iframe)

The page inside a <frame> or <iframe> element can be obtained with HtmlPage.getFrames().
Suppose you have the following page:

<html>
  <body>
    <iframe src="two.html"></iframe>
  </body>
</html>

You can use the following code to get the content of two.html:

final List<FrameWindow> windows = page.getFrames();
final HtmlPage pageTwo = (HtmlPage) windows.get(0).getEnclosedPage();

Another example that navigates API docs to get a desired page of a class:

final WebClient client = new WebClient();
final HtmlPage mainPage = client.getPage("http://htmlunit.sourceforge.net/apidocs/index.html");

To get the page of the first frame (at upper left) and click the sixth link:

final HtmlPage packageListPage = (HtmlPage) mainPage.getFrames().get(0).getEnclosedPage();
packageListPage.getAnchors().get(5).click();

To get the page of the frame named 'packageFrame' (at lower left) and click the second link:

final HtmlPage packagePage = (HtmlPage) mainPage.getFrameByName("packageFrame").getEnclosedPage();
packagePage.getAnchors().get(1).click();

To get the page of the frame named 'classFrame' (at right):

final HtmlPage classPage = (HtmlPage) mainPage.getFrameByName("classFrame").getEnclosedPage();


 

05 Using Windows

All pages are contained within WebWindow objects. This could be a TopLevelWindow representing an actual browser window, an HtmlFrame representing a <frame> element or an HtmlInlineFrame representing an <iframe> element.

When a WebClient is first instantiated, a TopLevelWindow is automatically created. You could think of this as being the first window displayed by a web browser. Calling WebClient.getPage(WebWindow, WebRequest) will load the new page into this window.

The JavaScript open() function can be used to load pages into other windows. New WebWindow objects will be created automatically by this function.


If you wish to be notified when windows are created or pages are loaded, register a WebWindowListener with the WebClient via the method WebClient.addWebWindowListener(WebWindowListener).

When a window is opened either by JavaScript or through the WebClient, a WebWindowEvent will be fired and passed into the WebWindowListener.webWindowOpened(WebWindowEvent) method. Note that both the new and old pages in the event will be null as the window does not have any content loaded at this point. If a URL was specified during creation of the window then the page will be loaded and another event will be fired as described below.

When a new page is loaded into a specific window, a WebWindowEvent will be fired and passed into the WebWindowListener.webWindowContentChanged(WebWindowEvent) method.


 

06 Using JavaScript

A frequent question we get is "how do I test my JavaScript?". There is nothing specific to do for JavaScript: it is processed automatically. You just need to call getPage(), find the element to click(), and then check the result. Tests for complex JavaScript libraries are included in the HtmlUnit test base; browsing them is a useful way to get an idea of what is possible.

Usually, you should wait() or sleep() a little, as HtmlUnit can finish before the AJAX response is retrieved from the server; the HtmlUnit FAQ covers this.
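Rather than sleeping for a fixed time, a test can poll until the expected change shows up or a timeout expires. The helper below is a minimal sketch in plain Java. The class WaitSketch and the method waitFor are hypothetical names, not part of HtmlUnit, and the condition in main is only a stand-in for "the AJAX response has updated the page".

```java
import java.util.function.BooleanSupplier;

public class WaitSketch {

    // Polls the condition every intervalMillis until it holds or
    // timeoutMillis has elapsed. Returns true if the condition became
    // true in time, false on timeout or if the thread was interrupted.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        // Stand-in condition: becomes true about 200 ms after start.
        boolean done = waitFor(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("condition met: " + done);
    }
}
```

In an HtmlUnit test the condition would inspect the page, for example checking that page.asText() contains the text the AJAX call is expected to add. WebClient also provides waitForBackgroundJavaScript(long timeoutMillis), which waits for pending background JavaScript jobs and is often enough on its own.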

Below are some examples:


Let's say that we have a page containing JavaScript that will dynamically write content to the page. The following html will dynamically generate five textfields and place them inside a table. Each textfield will have a unique name created by appending the index to the string "textfield".

<html><head><title>Table sample</title></head><body>
    <form action='/foo' name='form1'>
    <table id="table1">
        <script type="text/javascript">
            for (i = 1; i <= 5; i++) {
                document.write("<tr><td>" + i
                    + "</td><td><input name='textfield" + i
                    + "' type='text'></td></tr>");
            }
        </script>
    </table></form>
</body></html>

We would likely want to test that the five text fields were created so we could start with this.

@Test
public void documentWrite() throws Exception {
    final WebClient webClient = new WebClient();
 
    final HtmlPage page = webClient.getPage("http://myserver/test.html");
    final HtmlForm form = page.getFormByName("form1");
    for (int i = 1; i <= 5; i++) {
        final String expectedName = "textfield" + i;
        Assert.assertEquals(
            "text", 
            form.<HtmlInput>getInputByName(expectedName).getTypeAttribute());
    }
}

We might also want to check off-by-one errors by ensuring that it didn't create "textfield0" or "textfield6". Trying to get an element that doesn't exist will cause an exception to be thrown so we could add this to the end of the previous test.

try {
    form.getInputByName("textfield0");
    fail("Expected an ElementNotFoundException");
}
catch (final ElementNotFoundException e) {
    // Expected path
}
 
try {
    form.getInputByName("textfield6");
    fail("Expected an ElementNotFoundException");
}
catch (final ElementNotFoundException e) {
    // Expected path
}

Often you want to watch alerts triggered by JavaScript.

<html><head><title>Alert sample</title></head>
<body onload='alert("foo");'>
</body></html>

Alerts are tracked by an AlertHandler which will be called whenever the JavaScript alert() function is called. In the following test, we register an alert handler which just saves all messages into a list. When the page load is complete, we compare that list of collected alerts with another list of expected alerts to ensure they are the same.

@Test
public void alerts() throws Exception {
    final WebClient webClient = new WebClient();
 
    final List<String> collectedAlerts = new ArrayList<String>();
    webClient.setAlertHandler(new CollectingAlertHandler(collectedAlerts));
 
    // Since we aren't actually manipulating the page, we don't assign
    // it to a variable - it's enough to know that it loaded.
    webClient.getPage("http://tciludev01/test.html");
 
    final List<String> expectedAlerts = Collections.singletonList("foo");
    Assert.assertEquals(expectedAlerts, collectedAlerts);
}

Handling prompt dialogs, confirm dialogs and status line messages works in the same way as alerts. You register a handler of the appropriate type and it will get notified when that method is called. See WebClient.setPromptHandler(), WebClient.setConfirmHandler() and WebClient.setStatusHandler() for details on these.

Most event handlers are already implemented: onload, onclick, ondblclick, onmouseup, onsubmit, onreadystatechange, ... They will be triggered at the appropriate time just like in a "real browser".

If the event that you wish to test is not yet supported then you can directly invoke it through the ScriptEngine. Note that while the script engine is publicly accessible, we do not recommend using it directly unless you have no other choice. It is much better to manipulate the page as a user would by clicking on elements and shifting the focus around.


 

07 ActiveX 사용

Although HtmlUnit is a pure Java implementation that simulates browsers, there are some cases where platform-specific features require integration of other libraries, and ActiveX is one of them.

Internet Explorer on Windows can run arbitrary ActiveX components (provided that the security level is lowered on purpose, if the user trusts the website). Neither HtmlUnit nor Internet Explorer has any control over the behavior of a running ActiveX component, so be careful before using this feature.


The current implementation depends on Jacob, and because Jacob has a .dll dependency, it was not uploaded to the Maven repository. The dependency is optional, i.e. the Jacob jar is not needed for compiling or for normal use of HtmlUnit.

To use Jacob, add jacob.jar to the classpath and put the .dll in the path (java.library.path) so that the following code works for you:

final ActiveXComponent activeXComponent = new ActiveXComponent("InternetExplorer.Application");
final boolean busy = activeXComponent.getProperty("Busy").getBoolean();
System.out.println(busy);

The only other thing needed is to set this WebClient option:

webClient.getOptions().setActiveXNative(true);

and there you go!

 

Posted by 장안동베짱e :